Disambiguation in note styles

fbennett
A note on some fresh developments in citeproc-js land that affect the CSL test suite.

In response to feedback on the MLZ Bluebook style, I put in some work in citeproc-js to get backreference glosses working. The form implemented for Bluebook support looks like this:

  Smith, His Very Long Book Title (2000) [hereinafter Smith, His Book]

The tricky bits are that (a) the gloss should be applied only if there are subsequent back-references; and (b) the note number should be included for disambiguation purposes. A test that captures the behaviour is here:

    https://bitbucket.org/bdarcus/citeproc-test/src/737afd7171005f9d53cf221f8a71f21007d10386/processor-tests/humans/disambiguate_BasedOnSubsequentFormWithBackref2.txt

A question for the list is whether first-reference-note-number should always be included for disambiguation purposes, or whether it should be discretionary. In the current implementation, it is included only if givenname-disambiguation-rule="by-cite" (the default). When another rule is used, the cite to Roe in the test fixture linked above would have the gloss, and the backreference would show the title.

While the name of givenname-disambiguation-rule suggests that it affects only given names, the general effect of the "by-cite" rule is to make citations as compact as possible; and dropping the gloss where it is not strictly necessary has that effect.

While testing the implementation, I found it necessary, in styles that use disambiguate="true", to force a rerun of disambiguation for first references that are moved in the document, together with all back-references that point to them, to ensure that the document reflects the actual disambiguation state of each reference in the set. This change in behaviour affected three tests in the test suite:

    https://bitbucket.org/bdarcus/citeproc-test/commits/737afd7171005f9d53cf221f8a71f21007d10386

Finally, to control the appearance of the gloss on first references, I had to introduce a test condition (which I've added to the CSL-m schema) that returns true only if there are subsequent references to the item:

    https://github.com/fbennett/schema/commit/6881c98ae752e106b9c62673c5aa42d743fc0c7b

A condition that tests for subsequent back-references is needed to implement back-reference glosses, regardless of whether note numbers are included for disambiguation purposes.
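
A minimal Python sketch of such a condition, for illustration only: it is not citeproc-js code, and the function names and the note layout (a list of notes, each a list of item ids) are assumptions made for the example. The gloss is appended to a first reference only when the same item is cited again later.

```python
from typing import List, Set


def items_cited_again(notes: List[List[str]]) -> Set[str]:
    """Return the ids of items that are cited again after their first cite."""
    seen: Set[str] = set()
    repeated: Set[str] = set()
    for note in notes:
        for item_id in note:
            if item_id in seen:
                # A second cite, even within the same note, counts as a back-reference.
                repeated.add(item_id)
            seen.add(item_id)
    return repeated


def render_first_reference(full_form: str, short_form: str, has_backref: bool) -> str:
    """Append the 'hereinafter' gloss only when a back-reference exists."""
    if has_backref:
        return f"{full_form} [hereinafter {short_form}]"
    return full_form


notes = [["smith2000"], ["jones2014"], ["smith2000"]]
print(render_first_reference(
    "Smith, His Very Long Book Title (2000)",
    "Smith, His Book",
    "smith2000" in items_cited_again(notes),
))
```

Because the result depends on cites that come later in the document, the first reference has to be rendered again whenever a later cite to the same item is added or removed.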

Frank


------------------------------------------------------------------------------
HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
Find What Matters Most in Your Big Data with HPCC Systems
Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
Leverages Graph Analysis for Fast Processing & Easy Data Exploration
http://p.sf.net/sfu/hpccsystems
_______________________________________________
xbiblio-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel

Re: Disambiguation in note styles

Sebastian Karcher
Sorry, I feel like I'm missing things here.
Why would the appearance of first-reference-note-number be contingent
on a disambiguation preference unless it is in an if condition testing for
disambiguation?
Also, not all styles will have givenname disambiguation at all, so
it's very much possible to have a style _without_ a givenname
disambiguation rule. Generally I'm not happy using that rule for
anything but givennames; that's just going to create chaos.

I'm probably missing something, but I'm going to guess that I've spent
more time on this than most others, so if I don't get it, probably a
lot of others won't, either. So maybe you could step back a bit and
try to explain again why the disambiguation rule and the
first-note number are involved here?

--
Sebastian Karcher
Ph.D. Candidate
Department of Political Science
Northwestern University


Re: Disambiguation in note styles

fbennett
On Thu, Jun 12, 2014 at 10:34 AM, Sebastian Karcher <[hidden email]> wrote:
> Sorry, I feel like I'm missing things here.
> Why would the appearance of first-reference-note-number be contingent
> on a disambiguation preference unless it is in an if condition testing for
> disambiguation?

Disambiguation is just hard, and I may have misstated some things (as well as being unclear). Let's take the points in order.

Disambiguation settings are a property of the disambiguation "pool" of which the item (not the specific cite) is a member. Members of a pool are those that render identically with all disambiguation settings turned off.
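
As a rough illustration of the pool idea (a Python sketch, not the processor's actual data structures; `build_pools` and the `bare_forms` mapping are hypothetical names), items can be grouped by the string they produce when every disambiguation setting is off:

```python
from collections import defaultdict
from typing import Dict, List


def build_pools(bare_forms: Dict[str, str]) -> Dict[str, List[str]]:
    """Group item ids by their bare rendering (all disambiguation turned off).

    Only groups with more than one member are returned, since a lone item
    has nothing to be confused with.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for item_id, bare in bare_forms.items():
        groups[bare].append(item_id)
    return {bare: ids for bare, ids in groups.items() if len(ids) > 1}


bare_forms = {"smithA": "Smith", "smithB": "Smith", "jones": "Jones"}
print(build_pools(bare_forms))
```

Here the two Smith items form one pool and the Jones item belongs to none, so any disambiguation settings would be stored once for the Smith pool rather than per cite.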

In the current processor version running in Zotero, ambiguity can be determined by comparing bare items, without regard to their position in the document. As a result, everything works fine.

When note numbers are included in the comparison, the relative position of two cites becomes relevant to the disambiguation comparison. If cites are in separate notes, they are always unambiguous:

1. Smith, Book A (1999)
2. Smith, Book B (2000)
3. Jones, Other Book (2014)
4. Smith, supra note 2.

In this example, we know which work by Smith is intended in note 4, because it points to a note that contains only one item by Smith.
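
The position-dependent comparison can be sketched as follows (Python, illustrative only; the function name and the note layout are assumptions, and the item is assumed to be cited at least once). A back-reference of the form "Smith, supra note N" is treated as ambiguous only when note N, the note of first reference, holds more than one item with the same bare form.

```python
from typing import Dict, List


def backref_is_ambiguous(notes: List[List[str]],
                         bare_forms: Dict[str, str],
                         item_id: str) -> bool:
    """True if the note of first reference also holds a look-alike item."""
    # The note in which the item is first cited (assumes it is cited somewhere).
    first_note = next(note for note in notes if item_id in note)
    look_alikes = {
        other for other in first_note
        if bare_forms[other] == bare_forms[item_id]
    }
    return len(look_alikes) > 1


bare_forms = {"smithA": "Smith", "smithB": "Smith", "jones": "Jones"}

# The four-note example above: each Smith item has a note of its own.
separate = [["smithA"], ["smithB"], ["jones"], ["smithB"]]
print(backref_is_ambiguous(separate, bare_forms, "smithB"))

# Both Smith items share note 1, so "Smith, supra note 1" no longer suffices.
shared = [["smithA", "smithB"], ["jones"], ["smithB"]]
print(backref_is_ambiguous(shared, bare_forms, "smithB"))
```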

-----
Scenario 1
-----

If note 2 in the example above is deleted, and the Book B reference is added to note 1, the title must be added to the subsequent reference:

1. Smith, Book A (1999); Smith, Book B (2000)
2. Jones, Other Book (2014)
3. Smith, Book B, supra note 1.

In the current version of the processor in Zotero, disambiguation evaluation is performed only when items are inserted or (entirely) removed. Therefore, the insertion of Book B into note 1 does not trigger reevaluation of disambiguation parameters, and the title ("Book B") is not added; only the note number will change, as a result of rerendering the cite, with the same disambiguation parameters, using refreshed input data.

The solution in this case is to rerun disambiguation of all partners in the disambiguation set (i.e. the Book A and Book B items).
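
A sketch of that selection step in Python (hypothetical names, not the citeproc-js implementation): when an item's first reference moves, every cite of every item in the same pool is queued for another disambiguation pass.

```python
from typing import List, Tuple


def cites_to_rerun(notes: List[List[str]],
                   pools: List[List[str]],
                   moved_item: str) -> List[Tuple[int, str]]:
    """Return (note number, item id) for each cite that must be re-evaluated."""
    # Partners are the members of the pool containing the moved item;
    # an item outside any pool only needs its own cites refreshed.
    partners = next(
        (set(pool) for pool in pools if moved_item in pool),
        {moved_item},
    )
    return [
        (number, item_id)
        for number, note in enumerate(notes, start=1)
        for item_id in note
        if item_id in partners
    ]


notes = [["smithA", "smithB"], ["jones"], ["smithB"]]
pools = [["smithA", "smithB"]]
print(cites_to_rerun(notes, pools, "smithB"))
```

For the arrangement in Scenario 1 this selects both cites in note 1 and the back-reference in note 3, and leaves the Jones cite alone.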

You are right that changes below the first reference will not affect the cites in note 1, in this example. Avoiding an update to first-position cites when it is not strictly necessary is not quite so simple as it may sound, however. A further example may help to illustrate.

-----
Scenario 2
-----

Let's start with the same arrangement as Scenario 1, but with a hereinafter gloss following the ambiguous first-reference cite, as required by some styles:

1. Smith, Book A (1999); Smith, Book B (2000) [hereinafter Smith, Book B]
2. Jones, Other Book (2014)
3. Smith, Book B, supra note 1.

If note 3 is edited to remove Book B, the gloss becomes unnecessary. The cites should look like this:

1. Smith, Book A (1999); Smith, Book B (2000)
2. Jones, Other Book (2014)
3. Ibid.

In this case, an update to note 1 is required when the Book B reference is removed. This is not required in the previous case, but to determine when it is and is not required, we must identify whether a disambiguate="true" condition will be encountered when it is rerendered. Given the potential complexity of condition statements, the simplest way to do that is to rerun all cites in the pool.

There may be some scope for reducing this small overhead without breaking things, but it seems sensible to start with a procedure that is known to work.

Anyway, that's the thinking there.

 

Re: Disambiguation in note styles

fbennett
(oops. the second example should be Scenario 2, of course.)



Re: Disambiguation in note styles

fbennett
In reply to this post by Sebastian Karcher
On Thu, Jun 12, 2014 at 10:34 AM, Sebastian Karcher <[hidden email]> wrote:
> Sorry, I feel like I'm missing things here.
> Why would the appearance of first-reference-note-number be contingent
> on a disambiguation preference unless it is in an if condition testing for
> disambiguation?
> Also, not all styles will have givenname disambiguation at all, so
> it's very much possible to have a style _without_ a givenname
> disambiguation rule. Generally I'm not happy using that rule for
> anything but givennames; that's just going to create chaos.

"Chaos" is a little strong, surely; but I understand your reservation, and that's why I explained the rationale for the choice: it's certainly not carved in stone or anything.

Since givenname-disambiguation-rule="by-cite" is the default behaviour, the default behaviour here would be to include first-reference note numbers when disambiguating. That would be toggled off when givenname-disambiguation-rule is set to some other value.

Alternatively, the setting could easily be given its own attribute. All that would be needed is to decide what it should be called, and what its default value would be. A total of nine independent repository styles use first-reference-note-number and perform disambiguation of some sort, so the set is pretty limited -- mostly legal styles. From a quick look at available documentation, I'd say that guides are generally unclear on what exactly is meant by "ambiguity", but flexibility is a good thing, so I'll revise and suggest a solo attribute "disambiguate-on-first-reference-note-number", with a default value of "false".
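
The two options can be compared in a small Python sketch. The attribute names come from the discussion above; the function and the dictionary representation of a style's attributes are assumptions made for illustration.

```python
from typing import Dict


def note_number_in_disambiguation(style_attrs: Dict[str, str],
                                  use_dedicated_attribute: bool) -> bool:
    """Decide whether first-reference note numbers take part in disambiguation."""
    if use_dedicated_attribute:
        # Proposed: a standalone attribute that defaults to "false".
        value = style_attrs.get("disambiguate-on-first-reference-note-number", "false")
        return value == "true"
    # Current implementation: piggyback on givenname-disambiguation-rule,
    # whose default value is "by-cite".
    rule = style_attrs.get("givenname-disambiguation-rule", "by-cite")
    return rule == "by-cite"


bare_style: Dict[str, str] = {}
print(note_number_in_disambiguation(bare_style, use_dedicated_attribute=False))
print(note_number_in_disambiguation(bare_style, use_dedicated_attribute=True))
```

The practical difference is the default: a style that sets neither attribute gets note-number disambiguation under the current coupling, and does not get it under the dedicated attribute.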



> I'm probably missing something, but I'm going to guess that I've spent
> more time on this than most others, so if I don't get it, probably a
> lot of others won't, either. So maybe you could step back a bit and
> try to explain again why the disambiguation rule and the
> first-note number are involved here?

The test case shows what the code needs to accomplish.

As CSL revisions are not an issue at the moment, I'm just posting this so that the details will be on file when the design cycle rolls around again. There isn't any pressure to make decisions about it in the short term.

Frank


 


Re: Disambiguation in note styles

Bruce D'Arcus-3
In reply to this post by fbennett

I get confused by long messages ;-)

Can we start with big picture? Please confirm the following:

1) this question is driven by the idiosyncrasies of supra referencing?*

2) CSL doesn't currently support supra referencing, and so this is an extension in MLZ?

If both are true, perhaps you should go with your suggestion (which seems reasonable), see how it works, and use that experience to suggest possible additions or changes to CSL proper?

Bruce

* I do know this is an important feature, but man I hate it; not only is it a PITA to implement, it's hostile to readers (me!). One additional wrinkle here related to both: what's the scope for back referencing in a 600-page book? The book? The chapter? The page? Do you need to allow this to be configured? If yes, how would you even implement it? ;-)




Re: Disambiguation in note styles

fbennett
On Thu, Jun 12, 2014 at 9:52 PM, Bruce D'Arcus <[hidden email]> wrote:

> I get confused by long messages ;-)
>
> Can we start with big picture? Please confirm the following:
>
> 1) this question is driven by the idiosyncrasies of supra referencing?*

Yes.

> 2) CSL doesn't currently support supra referencing, and so this is an extension in MLZ?

It does, actually: first-reference-note-number is one of the standard CSL variables, from CSL 1.0:

    http://citationstyles.org/downloads/specification.html#standard-variables

CSL also supports the "five-footnote rule" imposed by some legal styles:

    http://citationstyles.org/downloads/specification.html#note-distance

> If both are true, perhaps you should go with your suggestion (which seems reasonable), see how it works, and use that experience to suggest possible additions or changes to CSL proper?

Yep!

> Bruce
>
> * I do know this is an important feature, but man I hate it; not only is it a PITA to implement, it's hostile to readers (me!). One additional wrinkle here related to both: what's the scope for back referencing in a 600-page book? The book? The chapter? The page? Do you need to allow this to be configured? If yes, how would you even implement it? ;-)

It's used almost exclusively in article-length works, scoped to the individual article.
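
As a rough illustration of the two CSL features mentioned above, the following sketch tracks the first-reference note number for each item and checks whether a later cite falls within the note distance (five notes by default). It is not citeproc-js code: the function name annotate_cites and the dictionary keys are hypothetical, and it assumes that distance is measured from the most recent preceding cite to the same item.

```python
from typing import Dict, List, Optional, Tuple


def annotate_cites(
    cites: List[Tuple[str, int]], near_note_distance: int = 5
) -> List[dict]:
    """Annotate cites given as (item_id, note_number) in document order.

    Hypothetical helper: records, for each cite, whether it is a first or
    subsequent reference, the note number of the first reference, and
    whether the previous cite to the same item is close enough to count
    as a "near note".
    """
    first_note: Dict[str, int] = {}  # item_id -> note of first reference
    last_note: Dict[str, int] = {}   # item_id -> note of most recent cite
    annotated = []
    for item_id, note in cites:
        first_ref: Optional[int] = first_note.get(item_id)
        if first_ref is None:
            # First reference: no back-reference data is available yet.
            first_note[item_id] = note
            position, near = "first", False
        else:
            # Assumption: distance is counted from the preceding cite.
            position = "subsequent"
            near = (note - last_note[item_id]) <= near_note_distance
        last_note[item_id] = note
        annotated.append(
            {
                "item": item_id,
                "note": note,
                "position": position,
                "first_reference_note_number": first_ref,
                "near_note": near,
            }
        )
    return annotated


if __name__ == "__main__":
    for row in annotate_cites([("smith", 1), ("roe", 2), ("smith", 4), ("smith", 12)]):
        print(row)
```

A style could use the near-note flag to choose between a bare short form and a fuller "supra note N" form built from the first-reference note number; how a particular style makes that choice is outside the scope of this sketch.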

 



Re: Disambiguation in note styles

Sebastian Karcher
In reply to this post by fbennett
Sorry if "Chaos" sounded rough--it was just shorthand for a concern
that it might be confusing. I think I understand the issue now, and
that makes sense. I'd much rather have that in a separate attribute,
not just because of the naming, but also because I think that,
systematically, the givenname-disambiguation-rule should only apply
when givenname disambiguation is turned on, which may or may not be
the case in the references you're dealing with.

Sebastian
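
To make the separate-attribute idea concrete, here is a minimal sketch of how a flag named after the proposed "disambiguate-on-first-reference-note-number" attribute (default "false") might change which back-references count as ambiguous, independently of givenname-disambiguation-rule. Everything apart from the attribute name is an assumption for illustration: the BackReference class, the ambiguous_short_forms function, and the grouping logic are hypothetical and do not describe how citeproc-js works internally.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class BackReference:
    """Hypothetical record for one cited item's back-reference form."""
    item_id: str
    short_form: str                   # e.g. the author-only short form
    first_reference_note_number: int  # note in which the item was first cited


def ambiguous_short_forms(
    refs: Iterable[BackReference],
    disambiguate_on_first_reference_note_number: bool = False,
) -> Set[str]:
    """Return the short forms that are shared by more than one item.

    With the flag off, only the short form is compared. With the flag on,
    the first-reference note number is part of the comparison, so two items
    first cited in different notes are already distinct.
    """
    groups = defaultdict(set)
    for ref in refs:
        note = (
            ref.first_reference_note_number
            if disambiguate_on_first_reference_note_number
            else None
        )
        groups[(ref.short_form, note)].add(ref.item_id)
    return {key[0] for key, ids in groups.items() if len(ids) > 1}


if __name__ == "__main__":
    refs = [BackReference("roe-a", "Roe", 1), BackReference("roe-b", "Roe", 2)]
    print(ambiguous_short_forms(refs))        # flag off
    print(ambiguous_short_forms(refs, True))  # flag on
```

Under these assumptions, two works by Roe first cited in notes 1 and 2 are ambiguous with the flag off (so a title or gloss would be needed) and unambiguous with it on; two works first cited in the same note stay ambiguous either way.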

On Thu, Jun 12, 2014 at 6:16 AM, Frank Bennett <[hidden email]> wrote:

> On Thu, Jun 12, 2014 at 10:34 AM, Sebastian Karcher
> <[hidden email]> wrote:
>>
>> sorry, I feel like I'm missing things here.
>> Why would the appearance of first-reference-note-number be contingent
>> on a disambiguation preference unless it is in an if loop testing for
>> disambiguation?
>> Also, not all styles will have givenname disambiguation at all, so
>> it's very much possible to have a style _without_ a givenname
>> disambiguation rule. Generally I'm not happy using that rule for
>> anything but givennames—that's just going to create chaos.
>
>
> "Chaos" is a little strong, surely; but I understand your reservation, and
> that's why I explained the rationale for the choice: it's certainly not
> carved in stone or anything.
>
> Since givenname-disambiguation-rule="by-cite" is the default behaviour, the
> default behaviour here would be to include first-reference note numbers when
> disambiguating. That would be toggled off when givenname-disambiguation-rule
> is set to some other value.
>
> Alternatively, the setting could easily be given its own attribute. All that
> would be needed is to decide what it should be called, and what its default
> value would be. A total of nine independent repository styles use
> first-reference-note-number and perform disambiguation of some sort, so the
> set is pretty limited -- mostly legal styles. From a quick look at available
> documentation, I'd say that guides are generally unclear on what exactly is
> meant by "ambiguity", but flexibility is a good thing, so I'll revise and
> suggest a solo attribute "disambiguate-on-first-reference-note-number", with
> a default value of "false".
>
>
>>
>> I'm probably missing something, but I'm going to guess that I've spent
>> more time on this than most others, so if I don't get it, probably a
>> lot of others won't, either. So maybe you could step back a bit and
>> try to explain again why the disambiguation rule and the
>> first-note number are involved here?
>
>
> The test case shows what the code needs to accomplish.
>
> As CSL revisions are not an issue at the moment, I'm just posting this so
> that the details will be on file when the design cycle rolls around again.
> There isn't any pressure to make decisions about it in the short term.
>
> Frank
>
>
>
>>
>>
>> On Wed, Jun 11, 2014 at 6:31 PM, Frank Bennett <[hidden email]>
>> wrote:
>> > A note on some fresh developments in citeproc-js land that affect the
>> > CSL
>> > test suite.
>> >
>> > In response to feedback on the MLZ Bluebook style, I put in some work in
>> > citeproc-js to get backreference glosses working. The form implemented
>> > for
>> > Bluebook support looks like this:
>> >
>> >   Smith, His Very Long Book Title (2000) [hereinafter Smith, His Book]
>> >
>> > The tricky bits are that (a) the gloss should be applied only if there
>> > are
>> > subsequent back-references; and (b) the note number should be included
>> > for
>> > disambiguation purposes. A test that captures the behaviour is here:
>> >
>> >
>> > https://bitbucket.org/bdarcus/citeproc-test/src/737afd7171005f9d53cf221f8a71f21007d10386/processor-tests/humans/disambiguate_BasedOnSubsequentFormWithBackref2.txt
>> >
>> > A question for the list is whether first-reference-note-number should
>> > always
>> > be included for disambiguation purposes, or whether it should be
>> > discretionary. In the current implementation, it is included only if
>> > givenname-disambiguation-rule="by-cite" (the default). When another rule
>> > is
>> > used, the cite to Roe in the test fixture linked above would have the
>> > gloss,
>> > and the backreference would show the title.
>> >
>> > While the name of givenname-disambiguation-rule suggests that it affects
>> > only given names, the general effect of the "by-cite" rule is to make
>> > citations as compact as possible; and dropping the gloss where it is not
>> > strictly necessary has that effect.
>> >
>> > While testing the implementation, I found it necessary, in styles that
>> > use
>> > disambiguate="true", to force a rerun of disambiguation for first
>> > references
>> > that are moved in the document, together with all back-references that
>> > point
>> > to it, to assure that the document reflects the actual disambiguation
>> > state
>> > of each reference in the set. This change in behaviour affected three
>> > tests
>> > in the test suite:
>> >
>> >
>> >
>> > https://bitbucket.org/bdarcus/citeproc-test/commits/737afd7171005f9d53cf221f8a71f21007d10386
>> >
>> > Finally, to control the appearance of the gloss on first references, I
>> > had
>> > to introduce a test condition (which I've added to the CSL-m schema)
>> > that
>> > returns true only if there are subsequent references to the item:
>> >
>> >
>> >
>> > https://github.com/fbennett/schema/commit/6881c98ae752e106b9c62673c5aa42d743fc0c7b
>> >
>> > A condition that tests for subsequent back-references is needed to
>> > implement
>> > back-reference glosses, regardless of whether note numbers are included
>> > for
>> > disambiguation purposes.
>> >
>> > Frank
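
The condition described in the quoted message (true only if there are subsequent references to the item) can be sketched as follows. This is an illustration only: the function name first_reference_glosses is hypothetical, and the real condition lives in the CSL-m schema and citeproc-js, whose internals are not shown in this thread.

```python
from collections import Counter
from typing import List


def first_reference_glosses(cite_sequence: List[str]) -> List[bool]:
    """For each cite (item ids in document order), return True when the cite
    is a first reference AND the same item is cited again later, i.e. when a
    "[hereinafter ...]" gloss would be rendered on it.
    """
    remaining = Counter(cite_sequence)  # cites not yet visited, per item
    seen = set()
    flags = []
    for item_id in cite_sequence:
        remaining[item_id] -= 1          # this cite is no longer "later"
        is_first = item_id not in seen
        seen.add(item_id)
        flags.append(is_first and remaining[item_id] > 0)
    return flags


if __name__ == "__main__":
    print(first_reference_glosses(["smith", "roe", "smith"]))
```

Because the result for a first reference depends on cites that come after it, moving or deleting a cite changes the answer for other cites to the same item, which fits the need described above to rerun disambiguation for a moved first reference together with its back-references.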



--
Sebastian Karcher
Ph.D. Candidate
Department of Political Science
Northwestern University
