@rmzelle
Created June 16, 2011 18:30
NEW
``disambiguate-add-year-suffix`` [Step (4)]
If set to "true" ("false" is the default), an alphabetic year-suffix is
added to ambiguous cites (e.g. "Doe 2007, Doe 2007" becomes "Doe 2007a, Doe
2007b") and to their corresponding bibliographic entries. The assignment of
the year-suffixes follows the order of the bibliographic entries, and
additional letters are used once "z" is reached ("z", "aa", "ab", ..., "az",
"ba", etc.). By default the year-suffix is appended to the cite, and to the
first year rendered through ``cs:date`` in the bibliographic entry, but its
location can be controlled by explicitly rendering the "year-suffix" variable
using ``cs:text``. If "year-suffix" is rendered through ``cs:text`` in the
scope of ``cs:citation``, it is suppressed for ``cs:bibliography``, unless
it is also rendered through ``cs:text`` in the scope of ``cs:bibliography``,
and vice versa.
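
The suffix sequence described above ("z", "aa", "ab", ..., "az", "ba") behaves like a base-26 numbering without a zero digit. The following is only a minimal sketch of one way to generate it; the function name `year_suffix` and the zero-based `index` argument are assumptions made for illustration and do not come from the CSL specification or any CSL processor.

```python
def year_suffix(index: int) -> str:
    """Return the alphabetic suffix for a zero-based position in the bibliography order.

    Hypothetical helper: 0 gives "a", 25 gives "z", 26 gives "aa".
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = []
    n = index + 1  # work one-based so that "z" rolls over to "aa"
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord("a") + remainder))
    return "".join(reversed(letters))


# The first 27 suffixes, then the ones around the "az" to "ba" rollover.
print([year_suffix(i) for i in (0, 1, 25, 26, 27, 51, 52)])
```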
OLD
``disambiguate-add-year-suffix`` [Step (4)]
If set to "true" ("false" is the default), a year-suffix is added to
ambiguous cites (e.g. "Doe 2007, Doe 2007" becomes "Doe 2007a, Doe 2007b").
The placement of the year-suffix, by default appended to each cite, can be
controlled by explicitly rendering the "year-suffix" variable using
``cs:text``.
@rmzelle
Author

rmzelle commented Jun 16, 2011

I'm lost in your earlier explanation (https://gist.github.com/1029888#gistcomment-36052). "1. one sorts and groups on first author only" --> are you referring to sorting in the bibliography?

As for your post directly above, you could get either result. If the cites show two names, e.g. "(Doe and Jones 1999a, b; Doe and Smith 1999)" you get the first list in the bibliography. If, on the other hand, you limit the number of names to one, e.g. "(Doe 1999a, b, c)", you get the second list. It all depends on which cites are ambiguous.
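
To make the dependence on the rendered cite concrete, here is a rough sketch of the two cases just described. It assumes a heavily simplified model: each item is just a list of family names plus a year, items arrive in bibliography order, and a cite shows at most `max_names` names with no "et al." handling. The function `suffixed_cites` and its parameters are hypothetical, not part of any CSL processor, and the single-letter suffix only covers groups of up to 26 items.

```python
from collections import defaultdict


def suffixed_cites(items, max_names):
    """Render one cite per item, adding a year-suffix only where cites collide.

    `items` is a list of (family_names, year) pairs in bibliography order.
    """
    groups = defaultdict(list)
    for position, (families, year) in enumerate(items):
        label = " and ".join(families[:max_names])  # names actually shown in the cite
        groups[(label, year)].append(position)

    cites = [""] * len(items)
    for (label, year), positions in groups.items():
        for rank, position in enumerate(positions):
            # A suffix is only needed when several items render identically.
            suffix = chr(ord("a") + rank) if len(positions) > 1 else ""
            cites[position] = f"{label} {year}{suffix}"
    return cites


items = [
    (["Doe", "Jones"], 1999),
    (["Doe", "Jones"], 1999),
    (["Doe", "Smith"], 1999),
]
print(suffixed_cites(items, max_names=2))
# ['Doe and Jones 1999a', 'Doe and Jones 1999b', 'Doe and Smith 1999']
print(suffixed_cites(items, max_names=1))
# ['Doe 1999a', 'Doe 1999b', 'Doe 1999c']
```

With two names shown, only the two "Doe and Jones" items collide; with one name shown, all three do, which is why the suffixes in the bibliography differ between the two configurations.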

@bdarcus

bdarcus commented Jun 16, 2011

> I'm lost in your earlier explanation (https://gist.github.com/1029888#gistcomment-36052). "1. one sorts and groups on first author only" --> are you referring to sorting in the bibliography?

Yes.

> As for your post directly above, you could get either result. If the cites show two names, e.g. "(Doe and Jones 1999a, b; Doe and Smith 1999)" you get the first list in the bibliography. If, on the other hand, you limit the number of names to one, e.g. "(Doe 1999a, b, c)", you get the second list. It all depends on which cites are ambiguous.

If this is true as a clear, unambiguous, rule, then I accept your conclusion (at least about this particular style; I'm still skeptical it's a very good general explanation for how author-date styles work, and so might lead implementors down a wrong path).

How certain of this are you though, and that dropping the suffix on the Doe and Smith example would be required (rather than optional)? The Neuroimage one with Hasan doesn't test this case, so I've just not seen this. In my view, this adds significant processing complexity to year-suffix handling, so we better be sure it's 100% necessary if we're going to put it in the spec.

Either way, this does highlight the need for really careful examples. Without those, subtle details get lost really easily.

We haven't even gotten to how one knows what an author is for an author group, or for author suppression in citation ;-)

@fbennett

On the contrary, I'm pretty sure that Andrea and Sylvester have already been basing their work on the processing flow that Rintze and I have described. Andrea provided a set of carefully crafted disambiguation tests a year ago, two of which seem to cover exactly this case: AndreaEg3 and AndreaEg5.

If the tests are not on point, please craft one that covers the issue, so we can take a look at the precise requirements that you have in mind.

@bdarcus

bdarcus commented Jun 17, 2011 via email

@bdarcus

bdarcus commented Jun 17, 2011

Actually, here's the earlier example I was wondering if we agree on (and so in part whether the explanation works), and whether we have a test for:

http://forums.zotero.org/discussion/16284/citations-not-collapsing/#Item_15

Aside: I would expect that output on the bibliography regardless of the citation rules (whether et al was turned on, etc.).

A quick ls *uffix* | grep "bibliography" of the test directory didn't turn anything up, but I don't know my way around the suite very well.

@fbennett

There's no rush, but if you can provide a test that illustrates the issue, we'll all be on the same page. That was the original plan for these difficult ones. :)

@bdarcus

bdarcus commented Jun 17, 2011 via email

@fbennett

We will if you provide one.

@bdarcus

bdarcus commented Jun 17, 2011

Not entirely done, but can you take a look at this and tell me if I have it basically right (there are no existing styles for the bibliography mode), and what you suggest naming it?

https://gist.github.com/1030700

@fbennett

Thanks. I've added the test to the suite, with a few fixes to the CSL and a few tweaks to the input data to align the names with the expected result. This one passes out of the box.

@bdarcus

bdarcus commented Jun 17, 2011 via email

@fbennett

All of the names in the input data are different, although the initials for families Doe, Jones and Smith are all the same. If the initialize-with attribute is removed from the cs:name node in cs:citation, the year suffixes will disappear from the output, despite the fact that the entries are ambiguous in the bibliography.

That result is due to the underlying assumption I stated above, that the bibliography entries contain enough information to identify each entry uniquely, and that cites contain less information. Removing initialize-with from the cs:citation name node would violate that expectation -- the cites would then contain more information than is available in the bibliography. That may seem at first blush to be a troublesome case, but it's one that we can safely ignore, since bibliographies and their companion cites are never set up that way in production.
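
As an illustration of that effect, the sketch below checks which cites collide when given names are rendered as initials versus in full. Everything in it is an assumption made for the example: the names "John Doe" and "Jane Doe", the `initialize` flag standing in for the presence of the initialize-with attribute, and the helper functions themselves. It is not how any CSL processor is actually implemented.

```python
from collections import Counter


def render_name(given, family, initialize):
    """Render a name as it might appear in a cite, optionally reducing given names to initials."""
    if initialize:
        given = " ".join(part[0] + "." for part in given.split())
    return f"{given} {family}"


def ambiguous_cites(items, initialize):
    """Return the (name, year) cite labels shared by more than one item."""
    labels = [(render_name(given, family, initialize), year) for given, family, year in items]
    counts = Counter(labels)
    return [label for label, count in counts.items() if count > 1]


# Hypothetical input: different people whose initials happen to coincide.
items = [("John", "Doe", 2000), ("Jane", "Doe", 2000)]

print(ambiguous_cites(items, initialize=True))   # [('J. Doe', 2000)] -> year-suffixes needed
print(ambiguous_cites(items, initialize=False))  # [] -> cites already unique, no suffixes
```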

@fbennett

I've added two variants, with more and with less information in the citations. The year suffixes applied differ in each case, which is the expected behavior.

As a side note, in a recent trawl through old correspondence, I noticed a comment by Simon, two years ago, that disambiguation queries were one of the most frequent topics raised on the forums. That's no longer the case -- queries are now rare, and invariably resolved with a link to the disambiguation section of the wiki on zotero.org. I take that as evidence that we have gotten this right.

@bdarcus

bdarcus commented Jun 17, 2011 via email

@rmzelle
Author

rmzelle commented Jun 17, 2011

APA: http://owl.english.purdue.edu/owl/resource/560/06/ (at "Two or More Works by the Same Author in the Same Year")

@bdarcus

bdarcus commented Jun 17, 2011 via email

@rmzelle
Author

rmzelle commented Jun 17, 2011

> This is what still has me uneasy; I just have never taken the view that suffix behavior has anything to do with the details of the cites.

In the absence of cites, year-suffixes are useless in bibliographies, even in cases where bibliographic entries would otherwise be identical, as they don't add any information. Their value is in establishing a unique and therefore unambiguous pointer/cite to a bibliographic entry.

@bdarcus

bdarcus commented Jun 17, 2011 via email

@rmzelle
Author

rmzelle commented Jun 17, 2011

There is also a relevant APA snippet at http://owl.english.purdue.edu/owl/resource/560/03/, again under "Two or More Works by the Same Author in the Same Year"

@bdarcus

bdarcus commented Jun 17, 2011

In any case, I do suggest clarifying the proposed spec language by expanding the first sentence. E.g.:

If set to "true" ("false" is the default), applies to conditions where there are multiple items from the same author and year. Under these conditions, an alphabetic year-suffix is added to each of these bibliographic items, and to their corresponding cites (e.g. "Doe 2007, Doe 2007" becomes "Doe 2007a, Doe 2007b"). [insert exceptions, or further explanation]

@fbennett

+1

@rmzelle
Author

rmzelle commented Jun 18, 2011

That's not precise enough. Year-suffixes can be required even when the names aren't the same. E.g. (Doe 2000a, b) for a style that doesn't do name disambiguation, and has to disambiguate papers, one by John and one by Jane Doe.

@fbennett

> If that's true, then there may be two different rules to create a suffix, and so consequently both of us are also wrong (your explanation does not work for APA and Chicago, and mine doesn't work for Elsevier, etc.).

There might be, but again, we haven't seen any examples that are not correctly handled under the existing specification. The APA, Chicago and Elsevier guides describe disambiguation in different language, but as they are editorial guides (rather than specification documents), none of them dictate a comprehensive set of mandatory rules, complete with a description of the logical sequence of operations needed for machine-driven disambiguation. The question should not be whether these guides can be read to require output that CSL is incapable of producing, but whether copy editors actually do read them in that way. It never hurts to gather more information about editorial processes, but at the moment we have zero evidence of that.

@fbennett

> That's not precise enough. Year-suffixes can be required even when the names aren't the same. E.g. (Doe 2000a, b) for a style that doesn't do name disambiguation, and has to disambiguate papers, one by John and one by Jane Doe.

True, but the examples are sensible. Perhaps Bruce would be willing to see the text amended to read "multiple cites to the same author and year"?

@bdarcus

bdarcus commented Jun 18, 2011 via email

@fbennett

fbennett commented Jun 18, 2011 via email

@rmzelle
Author

rmzelle commented Jun 18, 2011

I'll have to think about it a bit more.

@rmzelle
Author

rmzelle commented Jun 20, 2011

The description of "disambiguate-add-year-suffix" should only discuss the reason for disambiguation if it differs from the reason for the other disambiguation options. Any shared requirements that activate disambiguation should be discussed in the introduction of the disambiguation section of the spec.

Would it be correct to say that, for all disambiguation methods, except for "disambiguate-add-givenname" with "givenname-disambiguation-rule" set to "all-names", "all-names-with-initials", "primary-name", or "primary-name-with-initials", disambiguation is performed to create an unambiguous link between the cite and the target bibliographic entry?

@fbennett

Yes. In those four cases, adding initials or full given names is more aggressive than strictly necessary for resolving cite/bib ambiguities. If one were to be picky, it would be a little more accurate to say "disambiguation is performed only when needed to create an unambiguous link between the cite and the target bibliographic entry".

@bdarcus

bdarcus commented Jun 20, 2011 via email
