non-dropping particles

Nick Bart
This continues a discussion started at https://forums.zotero.org/discussion/30974/2/any-idea-why-an-a-author-comes-last-in-the-bibliography/.

While the CSL schema in its current form seems adequate for dealing with non-dropping particles in European and Arabic names, I feel some aspects of interpretation need to be reviewed:

In a nutshell, I argue that “van den”, “al-” and friends are genuine non-dropping particles, but “La” and possibly a few others are not and are best seen as parts of a single multipart last name (just like “Van” in Belgian or American names, e.g., “Van Rompuy”).

The following is copied from https://forums.zotero.org/discussion/30974/2/any-idea-why-an-a-author-comes-last-in-the-bibliography/:

Certain names start with non-dropping particles. “Non-dropping” means these particles have to appear in in-text citations (“van den Keere”, “al-Hakim”). In a bibliography, however, they may be kept in place but ignored for sorting (“al-Hakim, Tawfiq” [sort under “H”], “van den Keere, Pieter” [sort under “K”]), or demoted for both sorting and display (“Hakim, Tawfiq al-”, “Keere, Pieter van den”).

The Chicago Manual clearly recommends the sort-and-display variant (16e: 8.10, 8.14, 16.71, 16.76); that’s why I would argue that all CSL Chicago styles should switch to `demote-non-dropping-particle="display-and-sort"`.

By contrast, any last name that does not function this way, i.e., one whose leading elements are never removed for sorting or display and which therefore appears in one and the same form throughout a document, both in text and in the bibliography, should be parsed as one multipart last name.

For example, I would argue that “La Fontaine” should be understood, contra the examples given in http://docs.citationstyles.org/en/stable/specification.html, as one single multipart last name, since “Fontaine” never seems to be used alone, either for sorting or for display (I’ve sometimes seen “Fontaine” used as a cross-reference pointing to “La Fontaine”, but that is not currently implemented in CSL anyway).

Parsing such “immutable” last names as multipart last names will most likely take care of all “potential objections to demoting the particle when demote-non-dropping-particle="display-and-sort" is applied for European name formatting” [fbennett] referred to earlier in this thread.
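
To illustrate the effect on sorting, here is a minimal Python sketch. It rests on assumptions: the `Name` class, `attach`, `sort_key` and `sorted_labels` are hypothetical helpers invented for this example, and the sort keys follow a simplified reading of the CSL specification, not the code of citeproc-js or any other processor. “La Fontaine” and “Van Rompuy” are entered as single multipart family names, as proposed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """Hypothetical structured name; fields mirror the CSL name-parts."""
    family: str
    given: str = ""
    dropping_particle: str = ""
    non_dropping_particle: str = ""


def attach(particle: str, family: str) -> str:
    """Join particle and family name, without a space after "al-"."""
    if not particle:
        return family
    separator = "" if particle.endswith("-") else " "
    return f"{particle}{separator}{family}"


def sort_key(name: Name, demote: str) -> tuple:
    """Sort key under a given demote-non-dropping-particle setting."""
    if demote == "never":
        # The particle stays in front and takes part in sorting.
        primary = attach(name.non_dropping_particle, name.family)
        particles = name.dropping_particle
    else:
        # "sort-only" and "display-and-sort": sort on the bare family name.
        primary = name.family
        particles = " ".join(
            p for p in (name.dropping_particle, name.non_dropping_particle) if p
        )
    return (primary.casefold(), particles.casefold(), name.given.casefold())


NAMES = [
    Name("Keere", "Pieter", non_dropping_particle="van den"),
    Name("La Fontaine", "Jean", dropping_particle="de"),  # multipart family name
    Name("Hakim", "Tawfiq", non_dropping_particle="al-"),
    Name("Van Rompuy", "Herman"),  # multipart family name
]


def sorted_labels(demote: str) -> list:
    """Return the in-text form of each name, in bibliography sort order."""
    ordered = sorted(NAMES, key=lambda n: sort_key(n, demote))
    return [attach(n.non_dropping_particle, n.family) for n in ordered]


print(sorted_labels("never"))
print(sorted_labels("sort-only"))
```

With "never", “van den Keere” sorts under “V”, after “La Fontaine”. With "sort-only" it moves up to “K”, while the two multipart family names are unaffected by the setting.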

If this seems acceptable so far, it would also mean that some of citeproc-js’s parsing rules need to be reviewed, e.g., the one on “La”. Protecting such names by wrapping them in double quotation marks would serve as a workaround, of course.

On the other hand, if there is a genuine need for more flexibility, e.g., allowing different demotion settings for individual groups of non-dropping particles (e.g., “al-” vs. “van den” vs. “La”), we’d have to discuss an extension of the CSL schema – but currently I don’t really think that’s necessary.

------------------------------------------------------------------------------

_______________________________________________
xbiblio-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel

Re: non-dropping particles

rmzelle
I searched around a bit, and I agree that "Jean de La Fontaine" might
not be the best example. Better examples might be "Ludwig van
Beethoven" (dropping particle) and "Vincent van Gogh" (non-dropping
particle). Then we get:

Display order with "demote-non-dropping-particle" set to “never” or “sort-only”:
"Beethoven, Ludwig van"
"van Gogh, Vincent"

Display order with "demote-non-dropping-particle" set to “display-and-sort”:
"Beethoven, Ludwig van"
"Gogh, Vincent van"

As the example above shows, "van" is ambiguous: it can be either a
dropping or a non-dropping particle. We thus cannot rely on automatic
parsing of two-field names (given and family name), like those used in
the Zotero UI, to identify particles and classify them as dropping or
non-dropping. The CSL spec currently doesn't discuss this type of
parsing, since it assumes fully structured metadata. But it's clear
that the particle parsing process is by far the most opaque aspect of
Zotero/CSL's particle treatment.
I'm really not a fan of protecting names in double quotation marks. I
think the best option would be for the Zotero UI to be more explicit
about particles, e.g. by offering a multi-part name field (given,
dropping particle, non-dropping particle, family, and suffix).
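
A minimal Python sketch of what fully structured name data buys us. The dictionary keys are the CSL-JSON name-part names. The function `sort_order_display` is a hypothetical helper written for this example and only approximates the name-as-sort-order display described in the CSL spec; it is not taken from any CSL processor.

```python
DEMOTE_SETTINGS = {"never", "sort-only", "display-and-sort"}

# The same particle, "van", stored in two different fields.
beethoven = {"given": "Ludwig", "dropping-particle": "van", "family": "Beethoven"}
van_gogh = {"given": "Vincent", "non-dropping-particle": "van", "family": "Gogh"}


def sort_order_display(name: dict, demote: str) -> str:
    """Render a name as "Family, Given" for a bibliography entry."""
    if demote not in DEMOTE_SETTINGS:
        raise ValueError(f"unknown setting: {demote}")
    family = name["family"]
    non_dropping = name.get("non-dropping-particle", "")
    # The dropping particle always follows the given name.
    trailing = [name.get("given", ""), name.get("dropping-particle", "")]
    if demote == "display-and-sort":
        # Demoted: the non-dropping particle moves behind the given name.
        trailing.append(non_dropping)
    elif non_dropping:
        # "never" and "sort-only": the particle stays in front for display.
        family = f"{non_dropping} {family}"
    tail = " ".join(part for part in trailing if part)
    return f"{family}, {tail}" if tail else family


for setting in ("never", "sort-only", "display-and-sort"):
    print(setting, "|", sort_order_display(beethoven, setting),
          "|", sort_order_display(van_gogh, setting))
```

Because the particle type is stored explicitly, nothing has to be guessed from the spelling of "van".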

Rintze

On Thu, Jul 23, 2015 at 6:58 AM, Nick Bart <[hidden email]> wrote:

Re: non-dropping particles

aurimas

I agree with Rintze about a more explicit UI, and that may come in the future (probably not for 5.0). I would still like to have automatic parsing that works correctly 99% of the time. The explicit UI would only be necessary where automatic parsing fails.

On Jul 23, 2015 9:30 AM, "Rintze Zelle" <[hidden email]> wrote:

Re: non-dropping particles

aurimas

That said, the dropping particle in Rintze's example can already be defined explicitly via the first name field, so it doesn't undergo any parsing anyway.
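
A rough Python sketch of that idea, under a loudly stated assumption: the rule used here (lowercase words at the end of the first name field are dropping particles, lowercase words at the start of the last name field are non-dropping particles) is a hypothetical heuristic for illustration only. It is not citeproc-js's actual parsing logic, and `split_two_field_name` is an invented name.

```python
def split_two_field_name(given: str, family: str) -> dict:
    """Split a two-field name into CSL-style name-parts (hypothetical rule)."""
    given_words = given.split()
    family_words = family.split()

    # Trailing lowercase words of the first name field: dropping particle.
    dropping = []
    while len(given_words) > 1 and given_words[-1].islower():
        dropping.insert(0, given_words.pop())

    # Leading lowercase words of the last name field: non-dropping particle.
    non_dropping = []
    while len(family_words) > 1 and family_words[0].islower():
        non_dropping.append(family_words.pop(0))

    return {
        "given": " ".join(given_words),
        "dropping-particle": " ".join(dropping),
        "non-dropping-particle": " ".join(non_dropping),
        "family": " ".join(family_words),
    }


# "van" typed into the first name field ends up as a dropping particle ...
print(split_two_field_name("Ludwig van", "Beethoven"))
# ... while "van" typed into the last name field is non-dropping.
print(split_two_field_name("Vincent", "van Gogh"))
```

Under this assumed rule, the field in which the user types the particle decides its type, so no per-particle guessing is needed for the Beethoven case.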

On Jul 23, 2015 9:40 AM, "Aurimas Vinckevicius" <[hidden email]> wrote:

Re: non-dropping particles

rmzelle
How is a regular Zotero user going to discover that that's possible, though?

Rintze

On Thu, Jul 23, 2015 at 10:44 AM, Aurimas Vinckevicius
<[hidden email]> wrote:

Re: non-dropping particles

fbennett
It could be documented >ducks<. Or you could have first-run guidance.
It's a pretty straightforward distinction, easy to remember once
you've been exposed to it.

Things will be a lot easier to document now that the parsing is driven
by a proper per-particle specification. The behavior is much better
defined than it was previously.


On Thu, Jul 23, 2015 at 11:48 PM, Rintze Zelle <[hidden email]> wrote:

> How is a regular Zotero user going to discover that that's possible, though?
>
> Rintze
>
> On Thu, Jul 23, 2015 at 10:44 AM, Aurimas Vinckevicius
> <[hidden email]> wrote:
>> Though the dropping particle in Rintze's example can already be defined
>> explicitly via first name field, so it doesn't undergo any parsing anyway.
>>
>> On Jul 23, 2015 9:40 AM, "Aurimas Vinckevicius" <[hidden email]>
>> wrote:
>>>
>>> I agree with Rintze about a more explicit UI and that may come in the
>>> future (probably not for 5.0). I would still like to have automatic parsing
>>> and have that work correctly 99% of the time. The explicit UI would only be
>>> necessary where automatic parsing fails.
>>>
>>> On Jul 23, 2015 9:30 AM, "Rintze Zelle" <[hidden email]> wrote:
>>>>
>>>> I searched around a bit, and I agree that "Jean de La Fontaine" might
>>>> not be the best example. Better examples might be "Ludwig van
>>>> Beethoven" (dropping particle) and "Vincent van Gogh" (non-dropping
>>>> particle). Then we get:
>>>>
>>>> Display order with "demote-non-dropping-particle" set to “never” or
>>>> “sort-only”:
>>>> "Beethoven, Ludwig van"
>>>> "van Gogh, Vincent"
>>>>
>>>> Display order with "demote-non-dropping-particle" set to
>>>> “display-and-sort”:
>>>> "Beethoven, Ludwig van"
>>>> "Gogh, Vincent van"
>>>>
>>>> As the example above shows, "van" has an ambiguous particle type and
>>>> we thus cannot rely on automatic parsing of two-field name fields
>>>> (given and family name) like those used in the Zotero UI to identify
>>>> particles and assign them as dropping or non-dropping. The CSL spec
>>>> currently doesn't discuss this type of parsing, since it assumes fully
>>>> structured metadata. But it's clear that the particle parsing process
>>>> is by far the most opaque aspect of Zotero/CSL's particle treatment.
>>>> I'm really not a fan of protecting names in double quotation marks. I
>>>> think the best option would be for the Zotero UI to be more explicit
>>>> about particles, e.g. by offering a multi-part name field (given,
>>>> dropping particle, non-dropping particle, family, and suffix).
>>>>
>>>> Rintze
>>>>
>>>> On Thu, Jul 23, 2015 at 6:58 AM, Nick Bart <[hidden email]>
>>>> wrote:
>>>> > This is to proceed with a discussion started on
>>>> >
>>>> > https://forums.zotero.org/discussion/30974/2/any-idea-why-an-a-author-comes-last-in-the-bibliography/.
>>>> >
>>>> > While the CSL schema in its current form seems adequate for dealing
>>>> > with
>>>> > non-dropping particles in European and Arabic names, I feel some
>>>> > aspects of
>>>> > interpretation need to be reviewed:
>>>> >
>>>> > In a nutshell, I argue that “van den”, “al-” and friends are genuine
>>>> > non-dropping particles, but “La” and possibly a few others are not and
>>>> > are
>>>> > best seen as parts of a single multipart last name (just like “Van” in
>>>> > Belgian or American names, e.g., “Van Rompuy”).
>>>> >
>>>> > The following is copied from
>>>> >
>>>> > https://forums.zotero.org/discussion/30974/2/any-idea-why-an-a-author-comes-last-in-the-bibliography/:
>>>> >
>>>> > Certain names start with non-dropping particles, where “non-dropping”
>>>> > means
>>>> > these particles have to appear in in-text citations (“van den Keere”,
>>>> > “al-Hakim”) but may or may not be dropped in a bibliography for sorting
>>>> > (“al-Hakim, Tawfiq” [sort under “H”], “van den Keere, Pieter” [sort
>>>> > under
>>>> > “K”]), or sorting and display (“Hakim, Tawfiq al-”, “Keere, Pieter van
>>>> > den”).
>>>> >

Re: non-dropping particles

fbennett
For what it's worth (and it's not a point that I would press hard in
the face of strong opposition), I'm not a fan of adding fields to the
UI for particle purposes. I think it would make manual entry a real
pain, and code maintenance would not be fun.

On Fri, Jul 24, 2015 at 12:04 AM, Frank Bennett <[hidden email]> wrote:

> It could be documented >ducks<. Or you could have first-run guidance.
> It's a pretty straightforward distinction, easy to remember once
> you've been exposed to it.
>
> Things will be a lot easier to document now that the parsing is driven
> by a proper per-particle specification. The behavior is much better
> defined than it was previously.
>
>
> On Thu, Jul 23, 2015 at 11:48 PM, Rintze Zelle <[hidden email]> wrote:
>> How is a regular Zotero user going to discover that that's possible, though?
>>
>> Rintze
>>
>> On Thu, Jul 23, 2015 at 10:44 AM, Aurimas Vinckevicius
>> <[hidden email]> wrote:
>>> Though the dropping particle in Rintze's example can already be defined
>>> explicitly via first name field, so it doesn't undergo any parsing anyway.
>>>
>>> On Jul 23, 2015 9:40 AM, "Aurimas Vinckevicius" <[hidden email]>
>>> wrote:
>>>>
>>>> I agree with Rintze about a more explicit UI and that may come in the
>>>> future (probably not for 5.0). I would still like to have automatic parsing
>>>> and have that work correctly 99% of the time. The explicit UI would only be
>>>> necessary where automatic parsing fails.
>>>>
>>>> On Jul 23, 2015 9:30 AM, "Rintze Zelle" <[hidden email]> wrote:
>>>>>
>>>>> I searched around a bit, and I agree that "Jean de La Fontaine" might
>>>>> not be the best example. Better examples might be "Ludwig van
>>>>> Beethoven" (dropping particle) and "Vincent van Gogh" (non-dropping
>>>>> particle). Then we get:
>>>>>
>>>>> Display order with "demote-non-dropping-particle" set to “never” or
>>>>> “sort-only”:
>>>>> "Beethoven, Ludwig van"
>>>>> "van Gogh, Vincent"
>>>>>
>>>>> Display order with "demote-non-dropping-particle" set to
>>>>> “display-and-sort”:
>>>>> "Beethoven, Ludwig van"
>>>>> "Gogh, Vincent van"
>>>>>
>>>>> As the example above shows, "van" is ambiguous between the two
>>>>> particle types, so we cannot rely on automatic parsing of two-field
>>>>> names (given and family name), like those used in the Zotero UI, to
>>>>> identify particles and classify them as dropping or non-dropping. The
>>>>> CSL spec currently doesn't discuss this type of parsing, since it
>>>>> assumes fully structured metadata. But it's clear that the particle
>>>>> parsing process is by far the most opaque aspect of Zotero/CSL's
>>>>> particle treatment. I'm really not a fan of protecting names in double
>>>>> quotation marks. I think the best option would be for the Zotero UI to
>>>>> be more explicit about particles, e.g. by offering a multi-part name
>>>>> field (given, dropping particle, non-dropping particle, family, and
>>>>> suffix).
>>>>>
>>>>> Rintze

Re: non-dropping particles

Bruce D'Arcus-3
I wonder if a workaround would be to have some toggle switch that would turn off the parsing for a specific name?
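
A minimal sketch of how such a per-name toggle could work, assuming a small hypothetical particle table. This is not citeproc-js or Zotero code: `NON_DROPPING_PARTICLES` and `parse_family` are made up for illustration, and only the output keys (`family`, `non-dropping-particle`) are taken from the CSL name parts discussed in this thread.

```python
# Hypothetical per-particle table: leading lowercase particles that are
# treated as non-dropping. A real table would be longer and language-aware.
NON_DROPPING_PARTICLES = ("van den", "van", "al-")


def parse_family(family: str, parse: bool = True) -> dict:
    """Split a family-name field into CSL-style name parts.

    With parse=False (the per-name toggle) the field is kept as one
    multipart family name, e.g. "La Fontaine".
    """
    if parse:
        # Try longer particles first so that "van den" wins over "van".
        for particle in sorted(NON_DROPPING_PARTICLES, key=len, reverse=True):
            # Hyphenated particles such as "al-" attach without a space.
            prefix = particle if particle.endswith("-") else particle + " "
            if family.startswith(prefix) and len(family) > len(prefix):
                return {
                    "non-dropping-particle": particle,
                    "family": family[len(prefix):],
                }
    return {"family": family}


print(parse_family("van den Keere"))
print(parse_family("al-Hakim"))
print(parse_family("Van Rompuy"))             # capitalised "Van": left whole
print(parse_family("van Gogh", parse=False))  # toggle off: left whole
```

Matching is case-sensitive on purpose in this sketch, so a capitalised "Van Rompuy" stays a single multipart family name, while the toggle covers names the table would otherwise split.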


Re: non-dropping particles

aurimas
In reply to this post by fbennett
That seems like something we should move to the Zotero forums. In any case, as I said before, the automatic parsing is still useful, so improving it would be great!


Re: non-dropping particles

rmzelle
Administrator
Well, other tools are bound to use a two-field setup as well, so there
is some merit in discussing it here.

Some kind of color-coding and/or a tooltip in the name fields showing
how a name is parsed could help as well. Or Zotero could offer a popup
box to help the user format individual names (e.g. accessible via a
right-click menu option). That would create a lot more space to
provide feedback. E.g. it could explain the role of the single-field
and two-field switch, and provide assistance in formatting the
particles (with examples and live preview of particle parsing).

Rintze
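
To make the preview idea concrete, here is a minimal sketch of what such a preview could compute for an already-parsed name under each `demote-non-dropping-particle` setting. It is not Zotero or citeproc-js code: the `Name` class and the `display_inverted` and `sort_key` functions are hypothetical, the ordering of parts follows one reading of the CSL specification, and the sample names are the Beethoven and van Gogh examples from earlier in the thread.

```python
from dataclasses import dataclass

SETTINGS = ("never", "sort-only", "display-and-sort")


@dataclass
class Name:
    """Structured name using the CSL name parts (hypothetical helper class)."""
    given: str
    family: str
    dropping_particle: str = ""
    non_dropping_particle: str = ""


def _join(*parts: str) -> str:
    """Join the non-empty parts with single spaces."""
    return " ".join(p for p in parts if p)


def _attach(particle: str, family: str) -> str:
    """Prefix a particle to a family name; no space after "al-" or "d'"."""
    if particle.endswith(("-", "'", "\u2019")):
        return particle + family
    return _join(particle, family)


def display_inverted(name: Name, demote: str) -> str:
    """Bibliography form ("Family, Given") under the given setting."""
    if demote not in SETTINGS:
        raise ValueError(f"unknown setting: {demote}")
    if demote == "display-and-sort":
        # The non-dropping particle is demoted behind the given name.
        head = name.family
        tail = _join(name.given, name.dropping_particle,
                     name.non_dropping_particle)
    else:
        # "never" and "sort-only" keep the particle in front for display.
        head = _attach(name.non_dropping_particle, name.family)
        tail = _join(name.given, name.dropping_particle)
    return f"{head}, {tail}" if tail else head


def sort_key(name: Name, demote: str) -> tuple:
    """Sort key; only "never" sorts under the non-dropping particle."""
    if demote not in SETTINGS:
        raise ValueError(f"unknown setting: {demote}")
    if demote == "never":
        parts = (_attach(name.non_dropping_particle, name.family),
                 name.dropping_particle, name.given)
    else:
        parts = (name.family,
                 _join(name.dropping_particle, name.non_dropping_particle),
                 name.given)
    return tuple(p.casefold() for p in parts)


beethoven = Name("Ludwig", "Beethoven", dropping_particle="van")
van_gogh = Name("Vincent", "Gogh", non_dropping_particle="van")

for setting in SETTINGS:
    for person in (beethoven, van_gogh):
        print(f"{setting:17} {display_inverted(person, setting):22} "
              f"{sort_key(person, setting)}")
```

With these inputs the sketch gives "van Gogh, Vincent" for "never" and "sort-only" and "Gogh, Vincent van" for "display-and-sort", while "Beethoven, Ludwig van" is the same under all three. A preview built this way could show the user at once whether a "van" was classified as intended.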

On Thu, Jul 23, 2015 at 11:47 AM, Aurimas Vinckevicius
<[hidden email]> wrote:

> That seems like something we should move to Zotero forums. In any case, like
> I said before, the automatic parsing is still useful, so improving that
> would be great!
>
> On Thu, Jul 23, 2015 at 10:07 AM, Frank Bennett <[hidden email]>
> wrote:
>>
>> For what it's worth (and it's not a point that I would press hard in
>> the face of strong opposition), I'm not a fan of adding fields to the
>> UI for particle-purposes. I think it would make manual entry a real
>> pain, and code maintenance would not be fun.
>>
>> On Fri, Jul 24, 2015 at 12:04 AM, Frank Bennett <[hidden email]>
>> wrote:
>> > It could be documented >ducks<. Or you could have first-run guidance.
>> > It's a pretty straightforward distinction, easy to remember once
>> > you're exposed to it once.
>> >
>> > Things will be a lot easier to document now that the parsing is driven
>> > by a proper per-particle specification. The behavior is much more
>> > well-defined than it was previously.
>> >
>> >
>> > On Thu, Jul 23, 2015 at 11:48 PM, Rintze Zelle <[hidden email]>
>> > wrote:
>> >> How is a regular Zotero user going to discover that that's possible,
>> >> though?
>> >>
>> >> Rintze
>> >>
>> >> On Thu, Jul 23, 2015 at 10:44 AM, Aurimas Vinckevicius
>> >> <[hidden email]> wrote:
>> >>> Though the dropping particle in Rintze's example can already be
>> >>> defined
>> >>> explicitly via first name field, so it doesn't undergo any parsing
>> >>> anyway.
>> >>>
>> >>> On Jul 23, 2015 9:40 AM, "Aurimas Vinckevicius"
>> >>> <[hidden email]>
>> >>> wrote:
>> >>>>
>> >>>> I agree with Rintze about a more explicit UI and that may come in the
>> >>>> future (probably not for 5.0). I would still like to have automatic
>> >>>> parsing
>> >>>> and have that work correctly 99% of the time. The explicit UI would
>> >>>> only be
>> >>>> necessary where automatic parsing fails.
>> >>>>
>> >>>> On Jul 23, 2015 9:30 AM, "Rintze Zelle" <[hidden email]>
>> >>>> wrote:
>> >>>>>
>> >>>>> I searched around a bit, and I agree that "Jean de La Fontaine"
>> >>>>> might
>> >>>>> not be the best example. Better examples might be "Ludwig van
>> >>>>> Beethoven" (dropping particle) and "Vincent van Gogh" (non-dropping
>> >>>>> particle). Then we get:
>> >>>>>
>> >>>>> Display order with "demote-non-dropping-particle" set to “never” or
>> >>>>> “sort-only”:
>> >>>>> "Beethoven, Ludwig van"
>> >>>>> "van Gogh, Vincent"
>> >>>>>
>> >>>>> Display order with "demote-non-dropping-particle" set to
>> >>>>> “display-and-sort”:
>> >>>>> "Beethoven, Ludwig van"
>> >>>>> "Gogh, Vincent van"
>> >>>>>
>> >>>>> As the example above shows, "van" has an ambiguous particle type and
>> >>>>> we thus cannot rely on automatic parsing of two-field name fields
>> >>>>> (given and family name) like those used in the Zotero UI to identify
>> >>>>> particles and assign them as dropping or non-dropping. The CSL spec
>> >>>>> currently doesn't discuss this type of parsing, since it assumes
>> >>>>> fully
>> >>>>> structured metadata. But it's clear that the particle parsing
>> >>>>> process
>> >>>>> is by far the most opaque aspect of Zotero/CSL's particle treatment.
>> >>>>> I'm really not a fan of protecting names in double quotation marks.
>> >>>>> I
>> >>>>> think the best option would be for the Zotero UI to be more explicit
>> >>>>> about particles, e.g. by offering a multi-part name field (given,
>> >>>>> dropping particle, non-dropping particle, family, and suffix).
>> >>>>>
>> >>>>> Rintze
>> >>>>>
>> >>>>> On Thu, Jul 23, 2015 at 6:58 AM, Nick Bart <[hidden email]>
>> >>>>> wrote:
>> >>>>> > This is to proceed with a discussion started on
>> >>>>> >
>> >>>>> >
>> >>>>> > https://forums.zotero.org/discussion/30974/2/any-idea-why-an-a-author-comes-last-in-the-bibliography/.
>> >>>>> >
>> >>>>> > While the CSL schema in its current form seems adequate for
>> >>>>> > dealing
>> >>>>> > with
>> >>>>> > non-dropping particles in European and Arabic names, I feel some
>> >>>>> > aspects of
>> >>>>> > interpretation need to be reviewed:
>> >>>>> >
>> >>>>> > In a nutshell, I argue that “van den”, “al-” and friends are
>> >>>>> > genuine
>> >>>>> > non-dropping particles, but “La” and possibly a few others are not
>> >>>>> > and
>> >>>>> > are
>> >>>>> > best seen as parts of a single multipart last name (just like
>> >>>>> > “Van” in
>> >>>>> > Belgian or American names, e.g., “Van Rompuy”).
>> >>>>> >
>> >>>>> > The following is copied from
>> >>>>> >
>> >>>>> >
>> >>>>> > https://forums.zotero.org/discussion/30974/2/any-idea-why-an-a-author-comes-last-in-the-bibliography/:
>> >>>>> >
>> >>>>> > Certain names start with non-dropping particles, where
>> >>>>> > “non-dropping”
>> >>>>> > means
>> >>>>> > these particles have to appear in in-text citations (“van den
>> >>>>> > Keere”,
>> >>>>> > “al-Hakim”) but may or may not be dropped in a bibliography for
>> >>>>> > sorting
>> >>>>> > (“al-Hakim, Tawfiq” [sort under “H”], “van den Keere, Pieter”
>> >>>>> > [sort
>> >>>>> > under
>> >>>>> > “K”]), or sorting and display (“Hakim, Tawfiq al-”, “Keere, Pieter
>> >>>>> > van
>> >>>>> > den”).
>> >>>>> >
>> >>>>> > The Chicago Manual clearly recommends the sort-and-display variant
>> >>>>> > (16e:
>> >>>>> > 8.10, 8.14, 16.71, 16.76); that’s why I would argue that all CSL
>> >>>>> > Chicago
>> >>>>> > styles should switch to
>> >>>>> > `demote-non-dropping-particle="display-and-sort"`.
>> >>>>> >
>> >>>>> > By contrast, any last name that does not function this way, i.e.,
>> >>>>> > where
>> >>>>> > elements are never removed from the front for purposes of sorting
>> >>>>> > or
>> >>>>> > display, or in other words, where the last name is always used in
>> >>>>> > one
>> >>>>> > and
>> >>>>> > the same form only throughout a document, both in text and in a
>> >>>>> > bibliography, should be parsed as one multipart last name.
>> >>>>> >
>> >>>>> > For example, I would argue that “La Fontaine” should be
>> >>>>> > understood,
>> >>>>> > contra
>> >>>>> > the examples given in
>> >>>>> > http://docs.citationstyles.org/en/stable/specification.html, as
>> >>>>> > one
>> >>>>> > single
>> >>>>> > multipart last name, since “Fontaine” never seems to be used
>> >>>>> > alone,
>> >>>>> > neither
>> >>>>> > for sorting nor display (I’ve sometimes seen “Fontaine” used as a
>> >>>>> > crossreference pointing to “La Fontaine”, but that’s nothing
>> >>>>> > currently
>> >>>>> > implemented in CSL anyway).
>> >>>>> >
>> >>>>> > Parsing such “immutable” last names as multipart last names will
>> >>>>> > most
>> >>>>> > likely
>> >>>>> > take care of all “potential objections to demoting the particle
>> >>>>> > when
>> >>>>> > demote-non-dropping-particle="display-and-sort" is applied for
>> >>>>> > European
>> >>>>> > name
>> >>>>> > formatting” [fbennett] referred to earlier in this thread.
>> >>>>> >
>> >>>>> > If this seems acceptable so far, it would also mean that some of
>> >>>>> > citeproc-js’s parsing rules need to be reviewed, e.g., the one on
>> >>>>> > “La”.
>> >>>>> > Protecting such names by wrapping them in double quotation marks
>> >>>>> > would
>> >>>>> > serve
>> >>>>> > as a workaround, of course.
>> >>>>> >
>> >>>>> > On the other hand, if a genuine need is felt to have more
>> >>>>> > flexibility,
>> >>>>> > e.g.,
>> >>>>> > allowing different settings for demoting various individual groups
>> >>>>> > of
>> >>>>> > non-dropping-particles (e.g., “al-” vs. “van den” vs. “La”) we’d
>> >>>>> > have
>> >>>>> > to
>> >>>>> > discuss an extension of the CSL schema – but currently I don’t
>> >>>>> > really
>> >>>>> > think
>> >>>>> > that’s necessary.
>> >>>>> >

Re: non-dropping particles

Nick Bart
Before inviting feedback on a number of questions, here’s my reasoning again: According to the Chicago Manual of Style, 16e, 8.10 and 16.71, “Pieter van den Keere” needs to appear in the text (leaving capitalisation issues aside) as “van den Keere” and in the bibliography as “Keere, Pieter van den”. The same applies to “Tawfiq al-Hakim”: “al-Hakim” and “Hakim, Tawfiq al-” (CMS 8.14, 16.76). This requires “van den” and “al-” to be entered or parsed as non-dropping particles, and “demote-non-dropping-particle” to be set to “display-and-sort”. This in turn requires names such as “La Fontaine” to be entered or parsed as one multi-part family name rather than, as the CSL specs used to suggest, with “La” as non-dropping particle and “Fontaine” as family name, or else we’d end up with the incorrect “Fontaine, Jean de La”. (Parsing “La Fontaine” as one multi-part family name seems appropriate anyway, since to the best of my knowledge the two elements of “La Fontaine” are never separated under any circumstances.) This, again, requires adjusting citeproc-js’s (and hopefully soon, Zotero’s) name parsing algorithm.
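
To make the consequences concrete, here is a minimal Python sketch of the two renderings discussed above. It is not citeproc-js code and not a full implementation of the CSL rules; the `Name` class and the function names are made up for illustration, and only the “display-and-sort” case is modelled.

```python
from dataclasses import dataclass


@dataclass
class Name:
    """Hypothetical container for the name parts discussed in this thread."""
    given: str
    family: str
    dropping: str = ""       # dropping particle, e.g. "de"
    non_dropping: str = ""   # non-dropping particle, e.g. "van den", "al-"


def join_particle(particle: str, family: str) -> str:
    """Attach a particle to a family name; "al-" and "d'" take no space."""
    if not particle:
        return family
    sep = "" if particle.endswith(("-", "'", "’")) else " "
    return f"{particle}{sep}{family}"


def in_text(name: Name) -> str:
    """Short in-text form: the non-dropping particle stays in front."""
    return join_particle(name.non_dropping, name.family)


def bibliography_display_and_sort(name: Name) -> str:
    """Inverted form with the non-dropping particle demoted behind the given name."""
    trailing = " ".join(
        part for part in (name.given, name.dropping, name.non_dropping) if part
    )
    return f"{name.family}, {trailing}" if trailing else name.family


keere = Name("Pieter", "Keere", non_dropping="van den")
print(in_text(keere))                        # van den Keere
print(bibliography_display_and_sort(keere))  # Keere, Pieter van den

# "La" entered as a non-dropping particle gives the unwanted form ...
split_la = Name("Jean", "Fontaine", dropping="de", non_dropping="La")
print(bibliography_display_and_sort(split_la))  # Fontaine, Jean de La

# ... while a multi-part family name keeps "La Fontaine" together.
whole_la = Name("Jean", "La Fontaine", dropping="de")
print(bibliography_display_and_sort(whole_la))  # La Fontaine, Jean de
```

The last two cases show why the parsing of “La” matters once “display-and-sort” is switched on: the style setting is the same in both, only the way the name was split differs.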

So my proposal is (1) to set “demote-non-dropping-particle” to “display-and-sort” in all Chicago styles (and, most likely, other styles, too), (2) to remove “La” and other strings that aren’t genuine non-dropping particles from the CSL specs and the list citeproc-js uses for parsing, and (3) to make citeproc-js’s name parsing algorithm not only field- but also case-specific. Field-specific means parsing ambiguous strings according to whether they are found at the front of the family field (-> non-dropping) or at the end of the given field (-> dropping); citeproc-js can do this. Case-specific means distinguishing, e.g., “Van” and “van”, and parsing, e.g., “Van Rompuy” as one multi-part family name, but splitting “van Gogh” into a non-dropping particle “van” and a (root) family name “Gogh”. Since I haven’t been able to find _any_ upper-case elements that would still count as dropping or non-dropping particles in this scheme, we might even be able to simplify the parsing algorithm to “lower-case strings at the front of the family field are parsed as non-dropping particles, lower-case strings at the end of the given field are parsed as dropping particles”.
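
The simplified rule at the end of the previous paragraph could be sketched roughly as follows in Python. This is only an illustration of the proposed rule, not the algorithm citeproc-js actually uses; the function names are invented, and the sketch looks at space-separated words only, so attached particles such as “al-” or “d’” are deliberately left out.

```python
def split_family(family_field: str) -> tuple[str, str]:
    """Return (non_dropping_particle, family).

    Leading lower-case words of the family field count as the particle.
    """
    words = family_field.split()
    i = 0
    # Always leave at least one word for the family name itself.
    while i < len(words) - 1 and words[i].islower():
        i += 1
    return " ".join(words[:i]), " ".join(words[i:])


def split_given(given_field: str) -> tuple[str, str]:
    """Return (given, dropping_particle).

    Trailing lower-case words of the given field count as the particle.
    """
    words = given_field.split()
    i = len(words)
    # Always leave at least one word for the given name itself.
    while i > 1 and words[i - 1].islower():
        i -= 1
    return " ".join(words[:i]), " ".join(words[i:])


print(split_family("van Gogh"))    # ('van', 'Gogh')
print(split_family("Van Rompuy"))  # ('', 'Van Rompuy')
print(split_family("La Fontaine")) # ('', 'La Fontaine')
print(split_given("Jean de"))      # ('Jean', 'de')
```

Because the decision rests on case alone, no list of known particles is needed in this sketch; whether that holds up in practice depends on the answers to the questions below.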

Note that even with field- and case-sensitive particle identification there are still a few strings that are ambiguous, and thus in some cases a name in the family field still needs to be protected for correct parsing (i.e., wrapped in quotes; this is an existing citeproc-js feature):

- A French “Paul de Man” (“de” = dropping particle) is entered as [Man] [Paul de];
- a Dutch “Paul de Man” (“de” = non-dropping particle) is entered as [de Man] [Paul];
- but for an American(ised) “Paul de Man” (CMS 8.5, “de” = part of family name), the family name will still have to be wrapped in quotes, ["de Man"] [Paul], in order to be parsed correctly as one multi-part family name.
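
The three “de Man” cases can be told apart mechanically once quote protection is taken into account. The Python sketch below assumes straight double quotes around the whole family field as the protection marker; the function name is hypothetical and the code is not taken from citeproc-js.

```python
def parse_family_field(family_field: str) -> tuple[str, str]:
    """Return (non_dropping_particle, family), honouring double-quote protection."""
    text = family_field.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        # Protected: the whole string is one multi-part family name.
        return "", text[1:-1]
    words = text.split()
    i = 0
    # Leading lower-case words are particles; keep one word for the family name.
    while i < len(words) - 1 and words[i].islower():
        i += 1
    return " ".join(words[:i]), " ".join(words[i:])


print(parse_family_field("Man"))       # French:   ('', 'Man'), "de" sits in the given field
print(parse_family_field("de Man"))    # Dutch:    ('de', 'Man')
print(parse_family_field('"de Man"'))  # American: ('', 'de Man')
```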

Now, the questions:

- Is there anything wrong with this reasoning?
- Is there anything problematic about these proposals?

And, more specifically:

- Is anyone aware of style guides or other authoritative sources that would call for treating particles, especially non-dropping ones, differently from what CMS recommends? (In particular, anything that could _not_ be solved by setting “demote-non-dropping-particle” to “sort-only” or “never”? And would a Dutch publication prefer “sort-only”?)
- Is anyone aware of upper-case name elements that are genuine _non-dropping_ particles, i.e., would have to appear as “Bla Doe” in the text but as “Doe, Paul Bla” in the bibliography? (All non-dropping particles I’ve come across so far are lower-case.)
    - Regarding Arabic names, would anyone ever want to display “Tawfiq Al-Hakim” as “Al-Hakim” and “Hakim, Tawfiq Al-”? Or would the use of upper case typically indicate that “Al-”/“El-” should be seen as part of the family name rather than as a particle, and thus sorted under “A” or “E”?
- Is anyone aware of upper-case name elements that are genuine _dropping_ particles? (All dropping particles I’ve come across so far are lower-case.)
- Thus, is the rule “Unless it’s part of a family name (and thus wrapped in quotes), any lower-case string must be a particle” sound?
- Is anyone aware of style guides or other authoritative sources that would ever call for separating the elements of multi-part family names such as “La Fontaine” or “Van Rompuy” for sorting or display? (If there were, I fear we’d have to discuss reviewing the CSL specs …)
- Can anyone provide an example of a real name with both dropping and non-dropping particles? (“Jean de La Fontaine” no longer qualifies; “Jean de van Gogh”, if it existed, might.)
- As far as I can see, all names with _genuine_ non-dropping particles are of Dutch or Arabic origin. Is anyone aware of others?
- What are your views on allowing the use of non-breaking spaces, like [de·Man] [Paul], for protecting multi-part family names from being parsed? (Prettier than quotes, but less obvious, and we’d still need the quotes for “d’Alembert” or “al-Hakim”, if these were ever found to need protection.)
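
On the non-breaking-space idea in the last question, one practical detail is that the parser would have to split on ordinary spaces only. The short Python sketch below illustrates the point; the helper name is made up, and nothing here reflects existing citeproc-js behaviour.

```python
NBSP = "\u00a0"  # non-breaking space, shown as "·" in the example above


def words_respecting_nbsp(field: str) -> list[str]:
    """Split on ordinary spaces only, so NBSP-joined words stay together."""
    return [word for word in field.split(" ") if word]


protected = "de" + NBSP + "Man"

# Generic whitespace splitting treats NBSP like a space and breaks the name up.
print(protected.split())                 # ['de', 'Man']

# Splitting on " " alone keeps the protected name in one piece.
print(words_respecting_nbsp(protected))  # ['de\xa0Man']
print(words_respecting_nbsp("de Man"))   # ['de', 'Man']
```

A parser that decides particle status word by word would then see the protected “de Man” as a single word containing an upper-case letter, and leave it alone.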

Finally, although we need a good parsing solution now, none of this should keep us from working on a better UI that could eliminate the need for this awkward parsing of name fields altogether. The algorithm might still be useful for parsing data upon import in the future.
