@alexkalderimis
Created March 3, 2016 16:14

Integration Results - An Apologia

BIBHEADER:

This section tests the emission of processing instructions. This is required for things like export and simply making sure that we can correctly format the bibliography.

All 5 tests in this group error, because citeproc-hs does not support this functionality at all. It is implemented in citeproc-server-hs instead, which in retrospect is probably the wrong place. The original library we used never implemented it, so we followed best practice and added new code rather than modifying existing code.

Recommendation: port functionality from citeproc-server-hs to citeproc-hs, make these tests pass.

Results:

[ERRORED] bibheader EntryspacingDefaultValueOne
Test mode UnknownMode "bibliography-header" not supported

[ERRORED] bibheader EntryspacingExplicitValueZero
Test mode UnknownMode "bibliography-header" not supported

[ERRORED] bibheader SecondFieldAlign
Test mode UnknownMode "bibliography-header" not supported

[ERRORED] bibheader SecondFieldAlignWithAuthor
Test mode UnknownMode "bibliography-header" not supported

[ERRORED] bibheader SecondFieldAlignWithNumber
Test mode UnknownMode "bibliography-header" not supported

BIBSECTION:

This tests processing instructions that allow some bibliography entries to be selected or de-selected based on rules. This is useful if you are writing a word processor and want to implement library functionality. It is pretty much useless for our purposes, and seems to me to be way out of scope for a reference formatter. It is part of the citeproc-js API.

All 8 tests fail.

Recommendation: do nothing

Results:

[FAILURE] bibsection Exclude
Expected:
[<div class="csl-bib-body">
<div class="csl-entry">Book A</div>
<div class="csl-entry">Book C</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">Article B</div>
<div class="csl-entry">Book A</div>
<div class="csl-entry">Book C</div>
</div>]

[FAILURE] bibsection Include
Expected:
[<div class="csl-bib-body">
<div class="csl-entry">Book A</div>
<div class="csl-entry">Book C</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">Article B</div>
<div class="csl-entry">Book A</div>
<div class="csl-entry">Book C</div>
</div>]

[FAILURE] bibsection Quash
Expected:
[<div class="csl-bib-body">
<div class="csl-entry">Article B</div>
<div class="csl-entry">Book A</div>
<div class="csl-entry">Manuscript D</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">Article B</div>
<div class="csl-entry">Book A</div>
<div class="csl-entry">Manuscript C</div>
<div class="csl-entry">Manuscript D</div>
</div>]

[FAILURE] bibsection Select
Expected:
[<div class="csl-bib-body">
<div class="csl-entry">Book A</div>
<div class="csl-entry">Book C</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">Article E</div>
<div class="csl-entry">Book A</div>
<div class="csl-entry">Book B</div>
<div class="csl-entry">Book C</div>
<div class="csl-entry">Book D</div>
</div>]

BUGREPORTS:

This section includes a large number of regression tests for issues that were posted to the citeproc-js mailing lists. It captures a number of conditions that may not have been adequately specified elsewhere. Many of these are relatively trivial, others are very specific.

We currently pass 65/73 (89%). The failures and their reasons are detailed separately below:

[FAILURE] bugreports AffiliationSpoofingDemoPageFullCiteCruftOnSubsequent:

Expected:
[Malone, U.S. Bureau of the Census, <i>Evaluating Components of International Migration: Consistency of 2000 Nativity Data</i>.]

Got:
[Nolan J. Malone, U.S. Bureau of the Census, <i>Evaluating Components of International Migration: Consistency of 2000 Nativity Data</i> (New York: Routledge, 2001).]

This tests a particular work-around implemented in citeproc-js which adds a non-standard CSL feature, namely institutions for authors. It does this in a very awkward way, by adding extra literal authors.

Recommendation: do not implement their hack. If we want authors to have institutions, add a field to Agent, and do this cleanly.

[FAILURE] bugreports ByBy

Expected:
[<div class="csl-bib-body">
<div class="csl-entry">Review of <i>My Title</i>, by Beaver Burtle. n.d.</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">Review of <i>My Title</i>, by Burtle, Beaver. n.d.</div>
</div>]

This tests that "by" isn't duplicated (not a problem we ever had). The failure is caused by the author inheriting its form through a stack of substitutions.

Recommendation: fix this by clearing inherited attributes when we enter macros.

[FAILURE] bugreports ChicagoAuthorDateLooping

Expected:
[(Anon.; Anon.; Manstein 1982)]

Got:
[(Manstein 1982 Sep ; [CSL STYLE ERROR: reference with no printed form.]; [CSL STYLE ERROR: reference with no printed form.])]

Not really sure what this is meant to be testing. The error we get appears to be related to representing anonymous authors.

Recommendation: investigate and fix.

[FAILURE] bugreports NumberInMacroWithVerticalAlign

Expected: 
[<sup><i>2</i></sup>; <sup><i>3</i>–<i>5</i></sup>]

Got:
[<sup><i>2</i></sup>; <sup><i>3–5</i></sup>]

This tests how a range of numbers is emitted when the page range is in superscript. We fail it because we wrap the entire range "3–5" in italics, whereas the test expects each page number to be formatted separately and joined with an unformatted separator.

Recommendation: not sure. Maybe implement this behaviour. It would require some reworking of the page range code.
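
A minimal sketch of the expected behaviour, in Python rather than the Haskell of citeproc-hs. The names format_number and format_range are hypothetical, and the italic and superscript markup is hard-coded to match this one test rather than read from a style:

```python
import re


def format_number(n: str) -> str:
    """Wrap a single page number in italics."""
    return f"<i>{n}</i>"


def format_range(locator: str) -> str:
    """Format each number of a range separately; the separator stays unformatted."""
    # Split on a hyphen or an en dash (U+2013), ignoring surrounding spaces.
    parts = re.split(r"\s*[-\u2013]\s*", locator.strip())
    # Join the individually formatted numbers with a bare en dash.
    inner = "\u2013".join(format_number(p) for p in parts)
    return f"<sup>{inner}</sup>"
```

The point is only that formatting is applied per number before joining, instead of to the joined string.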

[ERRORED] bugreports Places

Could not parse CSL content. "[Unexpected attributes] year-range-format=\"expanded\""

The CSL style includes a non-CSL attribute (year-range-format).

Recommendation: do nothing

[FAILURE] bugreports UndefinedNotString

Expected:
[<div class="csl-bib-body">
<div class="csl-entry">
    <div class="csl-left-margin">[1]</div><div class="csl-right-inline"><i>FOO BAR</i> <b>n.d.</b></div>
</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">
    <div class="csl-left-margin">[1]</div><div class="csl-right-inline"><i>FOO BAR</i>.</div>
</div>
</div>]

This tests grouping behaviour. The behaviour it expects is infuriating. It expects the no-date term to be emitted. The term is part of this section:

<group delimiter=" ">
  <text variable="container-title" form="short" text-case="title" font-style="italic"/>
  <group delimiter=", ">
    <text macro="year-date"/>
    <text variable="volume" font-style="italic"/>
    <!-- TODO: Change to page-first when Zotero supports it -->
    <text variable="page" form="short"/>
  </group>
</group>

Groups are meant to suppress all output if a variable is called and none of the called variables has any output. In this case the page and volume variables are called, neither has any output, and thus the whole group is suppressed.

I am reluctant to change this because:

  • I don't really want to implement non-spec behaviour
  • It is not clear why the macro should be emitted.
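
The group suppression rule described above can be sketched as follows. This is an illustration in Python, not the citeproc-hs implementation: Var, Term, Group and render are hypothetical names, and the year-date macro is simplified to a bare term that always emits "n.d.".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Var:
    name: str


@dataclass
class Term:
    text: str


@dataclass
class Group:
    children: List["Node"] = field(default_factory=list)
    delimiter: str = ""


Node = Union[Var, Term, Group]


def render(node: Node, item: Dict[str, str]) -> Tuple[str, bool, bool]:
    """Return (text, a variable was called, a called variable had output)."""
    if isinstance(node, Var):
        value = item.get(node.name, "")
        return value, True, bool(value)
    if isinstance(node, Term):
        return node.text, False, False
    texts, called, produced = [], False, False
    for child in node.children:
        text, c, p = render(child, item)
        called, produced = called or c, produced or p
        if text:
            texts.append(text)
    if called and not produced:
        # Variables were called but all were empty: suppress the whole group,
        # including any terms inside it.
        return "", True, False
    return node.delimiter.join(texts), called, produced
```

With only a container-title set, the inner group (term, volume, page) is suppressed and the "n.d." term goes with it, which is the output we currently produce.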

[FAILURE] bugreports UndefinedStr

Expected:
[<div class="csl-bib-body">
<div class="csl-entry">
    <div class="csl-left-margin">[1]</div><div class="csl-right-inline"> FamilyName G. n.d.</div>
</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">
    <div class="csl-left-margin">[1]</div><div class="csl-right-inline"> FamilyName G. .</div>
</div>
</div>]

Same as UndefinedNotString

[FAILURE] bugreports UriWrapping

Expected:
[<div class="csl-bib-body">
<div class="csl-entry">Broder, John M., and Ian Urbina. “All Eyes Turn to Virginia Senate Race.” <i>The New York Times</i>, November 9, 2006, sec. /. <a href="http://www.nytimes.com/2006/11/09/us/politics/09virginia.html?ex=1320728400&#38;amp;en=e65ed62ff1814d9b&#38;amp;ei=5088&#38;amp;partner=rssnyt&#38;amp;emc=rss.">http://www.nytimes.com/2006/11/09/us/politics/09virginia.html?ex=1320728400&#38;amp;en=e65ed62ff1814d9b&#38;amp;ei=5088&#38;amp;partner=rssnyt&#38;amp;emc=rss.</a></div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">Broder, John M., and Ian Urbina. “All Eyes Turn to Virginia Senate Race.” <i>The New York Times</i>, November 9, 2006, sec. /. http://www.nytimes.com/2006/11/09/us/politics/09virginia.html?ex=1320728400&amp;en=e65ed62ff1814d9b&amp;ei=5088&amp;partner=rssnyt&amp;emc=rss.</div>
</div>]

This tests for an extra non-spec feature which lets URIs be wrapped in link tags.

Recommendation: do nothing

DISAMBIGUATE:

This group tests disambiguation behaviour. This is important to get right so we should try and pass all of these. Some of them are rather difficult.

We currently pass 63/72 (87%). Individual cases are listed below:

[FAILURE] disambiguate AllNamesBaseNameCountOnFailureIfYearSuffixAvailable

Expected:
[Asthma et al. (1990a); Asthma et al. (1990b); Dropsy, Edward Enteritis, et al. (2000); Dropsy, Ernie Enteritis, et al. (2000)]

Got:
[Asthma, Bronchitis, et al. (1990a); Asthma, Bronchitis, et al. (1990b); Dropsy, Edward Enteritis, et al. (2000); Dropsy, Ernie Enteritis, et al. (2000)]

This tests a whole bunch of things:

  • That the last two items can be disambiguated by name expansion alone (success)
  • That the first two items don't get extra names added to them. This contradicts other tests elsewhere, and I cannot find an explanation of when extra names should and should not be preserved.

Not really sure what to do about this. I think we should ask on the mailing list to find out what the reasoning here is.

[FAILURE] disambiguate BasedOnEtAlSubsequent

Expected:
[(Baur, Fröberg, Baur, et al. 2000<i>a</i>; Baur, Schileyko &#38; Baur 2000<i>b</i>; Doe 2000)]

Got:
[(Baur, Fröberg, Baur, et al. 2000; Baur, Schileyko &#38; Baur 2000; Doe 2000)]

This tests that if a work is ever rendered in the subsequent form, we perform comparisons based on the subsequent and not the first form.

This massively complicates the disambiguation logic, and it would take some real thought to implement without turning everything into spaghetti.

Recommendation: aim to implement.

Back-Ref disambiguation

[ERRORED] disambiguate BasedOnSubsequentFormWithBackref
Could not parse simple: when expecting a Int, encountered String instead

[ERRORED] disambiguate BasedOnSubsequentFormWithBackref2
Could not parse CSL content. "[StyleParseException] the only value allowed for 'disambiguate' is 'true'. Got: check-ambiguity-and-backreference"

[ERRORED] disambiguate BasedOnSubsequentFormWithLocator
Could not parse simple: when expecting a Int, encountered String instead

This is a non-CSL spec feature.

Recommendation: ignore for now.

[FAILURE] disambiguate DisambiguateTrueAndYearSuffixOne

Expected:
[Pollock, 1979
Pollock, 1980]

Got:
[Pollock, 1979a
Pollock, 1980b]

Hypothesis: This tests incremental disambiguation. This is where disambiguating features are added incrementally until a citation is disambiguated. In this case we have:

where year-date is:

In this case the test expects the processor to first add choose/if/date, and then only add choose/if/text if the result is ambiguous. (I think).

[FAILURE] disambiguate IncrementalExtraText

Expected:
[Yost Trisk
John Smith, Book One
John Smith, Book Two
John Smith, Complete Works, ed. 5
John Smith, Complete Works, ed. 6]

Got:
[Yost Trisk
John Smith, Book One, ed. 3
John Smith, Book Two, ed. 4
John Smith, Complete Works, ed. 5
John Smith, Complete Works, ed. 6]

This definitely tests incremental disambiguation. This is probably useful, but very tricky to implement.

Recommendation: think about implementing, as a very low priority.
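
A rough sketch of what incremental disambiguation means here, in Python. It assumes each citation can be split into a base text plus an ordered list of optional extras; that split is a guess at the structure of the test's style, which is not shown in this report, and the function name disambiguate is hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def disambiguate(cites: List[Tuple[str, List[str]]]) -> List[str]:
    """Each cite is (base text, ordered optional extras).

    Extras are added one level at a time, and only to cites whose
    current rendering is still shared with another cite.
    """
    levels = [0] * len(cites)

    def rendered(i: int) -> str:
        base, extras = cites[i]
        return ", ".join([base] + extras[:levels[i]])

    while True:
        counts = Counter(rendered(i) for i in range(len(cites)))
        # Cites that are still ambiguous and have another extra to add.
        stuck = [i for i in range(len(cites))
                 if counts[rendered(i)] > 1 and levels[i] < len(cites[i][1])]
        if not stuck:
            return [rendered(i) for i in range(len(cites))]
        for i in stuck:
            levels[i] += 1
```

Fed the five citations from this test, the loop stops adding text to "Book One" and "Book Two" once the title alone separates them, and only the two "Complete Works" entries pick up their edition.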

[FAILURE] disambiguate SkipAccessedYearSuffix

Expected: [

Doe J. His Anonymous Life. http://example.com (accessed 15 December 2000). 1965a
Doe J. His Anonymous Life. http://example.com (accessed 15 December 2000). 1965b
]

Got: [

Doe J. His Anonymous Life. http://example.com (accessed 15 December 2000). 1965
Doe J. His Anonymous Life. http://example.com (accessed 15 December 2000). 1965
]

This expects these two references (ITEM-1 and ITEM-2) to be treated as the same work with different access dates. This is tricky to implement. Currently our concept of a work is coterminous with our concept of a reference, and supporting this would require us to distinguish the two.

[FAILURE] disambiguate YearSuffixAtTwoLevels

Expected:
[Smith, Jones &#38; Brown (1986a); Smith, Jones &#38; Brown (1986b); Smith, Jones, Brown, et al. (1986a); Smith, Jones, Brown, et al. (1986b)]

Got:
[Smith et al. (1986a); Smith et al. (1986b); Smith, Jones, Brown &#38; Green (1986a); Smith, Jones, Brown &#38; Green (1986b)]

Not really sure what this is trying to prove. Why, for example, is "Smith, Jones & Brown" expanded fully even when that does not disambiguate the work? Why is Green not added? This is confusing.

Recommendation: email the list.

DISCRETIONARY:

[FAILURE] discretionary SuppressMultipleAuthors

Expected:
[(2005, 2006)
(Robert Jones 2000)]

Got:
[(2005, John Smith 2006)
(Robert Jones 2000)]

Here the same author is cited twice in a group and suppressed in the first one. The test expects the same author to be suppressed everywhere if suppressed anywhere.

This is really annoying, and adds tonnes of complexity. It is also no more than a minor convenience for the end-user.
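
For reference, the expected behaviour amounts to something like the sketch below. It is Python, the dictionary keys and the function name render_group are made up for illustration, and the real complexity lies in threading this through the existing citation rendering rather than in the rule itself.

```python
from typing import Dict, List


def render_group(cites: List[Dict[str, object]]) -> str:
    """Suppress an author throughout a citation group if any cite suppresses it."""
    # Collect every author flagged for suppression anywhere in the group.
    suppressed = {c["author"] for c in cites if c.get("suppress_author")}
    parts = []
    for c in cites:
        if c["author"] in suppressed:
            parts.append(str(c["year"]))
        else:
            parts.append(f"{c['author']} {c['year']}")
    return "(" + ", ".join(parts) + ")"
```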

INTEGRATION:

[FAILURE] integration CrossCitationIbidOnInsert

Expected:
[Doe v. Roe, 12 U.S. 23, 34 L.Ed. 45 (2001)
Id. at 78, 34 L.Ed. 89
Smith v. Jones, 56 U.S. 67 (2002)]

Got:
[Doe v. Roe, 12 U.S. 23 (2001); Doe v. Roe, 34 L.Ed. 45 (2001)
Doe v. Roe, 12 U.S. 78; Doe v. Roe, 34 L.Ed. 89
Smith v. Jones, 56 U.S. 67 (2002)]

No idea what this is doing. Some legal thing.

LABEL:

[FAILURE] label PageWithEmbeddedLabel

Expected:
[chap. 13]

Got:
[p. ch. 13]

Here the citation has a "page" locator with the value "ch. 13". The processor expects us to parse the locator and turn it into a locator with the value "13" and the label "chapter".

This is fairly easily doable, but adds somewhat unnecessary complexity. It would however allow us to use "chapters" in the refme system, where we only support pages.
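
A minimal sketch of the locator parsing, in Python. The LABELS table and the function name split_locator are hypothetical; a real implementation would presumably take its abbreviations from the locale terms rather than a hard-coded table.

```python
import re
from typing import Tuple

# Hypothetical abbreviation table, for illustration only.
LABELS = {
    "p.": "page",
    "pp.": "page",
    "ch.": "chapter",
    "chap.": "chapter",
    "vol.": "volume",
}


def split_locator(label: str, value: str) -> Tuple[str, str]:
    """If the value starts with a known label abbreviation, move it into the label."""
    match = re.match(r"\s*([A-Za-z]+\.)\s*(.+)$", value)
    if match and match.group(1).lower() in LABELS:
        return LABELS[match.group(1).lower()], match.group(2)
    # No embedded label found: keep the locator as given.
    return label, value
```

For this test, split_locator("page", "ch. 13") gives the label "chapter" and the value "13", which the style can then render as "chap. 13".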

Recommendation: implement

LOCALE:

[ERRORED] locale TermInSort

Could not parse CSL content. "[Unexpected attributes] genre=\"radio-broadcast\""

This test uses MLZ features. Very useful ones, but this is not a CSL test.

Recommendation: ignore for now. Implement when we want genres.

LOCATOR:

[FAILURE] locator TrickyEntryForPlurals

Expected:
[Book Title, vol. 1, fol. 186, 8 April 1544]

Got:
[Book Title, pp. vol. 1, fol. 186, 8 April 1544]

This tests what happens when the "page" variable of a reference has the insane content: "vol. 1, fol. 186, 8 April 1544" - namely the label should be suppressed.

Recommendation: maybe implement?

MAGIC:

[FAILURE] magic CitationLabelInBibliography

Expected:
[<div class="csl-bib-body">
<div class="csl-entry">
    <div class="csl-left-margin">[Doe65]</div><div class="csl-right-inline">Doe, J.: Book A., 1965.</div>
</div>
<div class="csl-entry">
    <div class="csl-left-margin">[RoNo78a]</div><div class="csl-right-inline">Roe, J. and Noakes, R.: Book A., 1978.</div>
</div>
<div class="csl-entry">
    <div class="csl-left-margin">[RoNo78b]</div><div class="csl-right-inline">Roe, J. and Noakes, R.: Book A., 1978.</div>
</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">
    <div class="csl-left-margin">[Doe65]</div><div class="csl-right-inline">Doe, J.: Book A., 1965.</div>
</div>
<div class="csl-entry">
    <div class="csl-left-margin">[RoNo78a]</div><div class="csl-right-inline">Roe, J. and Noakes, R.: Book A., 1978a.</div>
</div>
<div class="csl-entry">
    <div class="csl-left-margin">[RoNo78b]</div><div class="csl-right-inline">Roe, J. and Noakes, R.: Book A., 1978b.</div>
</div>
</div>]

We pass what this is actually testing (the citation labels). It does however also include incremental disambiguation logic (see above): it expects the citation label to be sufficiently disambiguating that we never need to add a suffix to the year.

Recommendation: ignore, citation labels are a dumb feature.

[FAILURE] magic SubsequentAuthorSubstituteOfTitleField

Expected:
[<div class="csl-bib-body">
<div class="csl-entry">Book X (2000)</div>
<div class="csl-entry">----- (2001)</div>
<div class="csl-entry">Book Y (2002)</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">Book X (2000)</div>
<div class="csl-entry">Book X (2001)</div>
<div class="csl-entry">Book Y (2002)</div>
</div>]

This tests that when the title substitutes for the author, the substituted title is also used for subsequent-author substitution.

Recommendation: maybe do it? It would involve significant complexity.
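
A sketch of the expected behaviour, in Python. It only models the entries in this test (a leading name or title followed by a year); the function name bibliography and the item keys are hypothetical.

```python
from typing import Dict, List, Optional


def bibliography(items: List[Dict[str, str]], substitute: str = "-----") -> List[str]:
    """Replace a repeated leading element with the substitute string."""
    entries: List[str] = []
    previous: Optional[str] = None
    for item in items:
        # The title stands in for the author when there is no author.
        lead = item.get("author") or item["title"]
        shown = substitute if lead == previous else lead
        entries.append(f"{shown} ({item['year']})")
        previous = lead
    return entries
```

The comparison is made on whatever ended up in the author position, so two author-less entries with the same title collapse just as two entries with the same author would.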

PARALLEL:

We fail all the parallel tests. They seem to be a legal thing. Ignore them all for now.

[FAILURE] parallel Bibliography

Expected:
[<div class="csl-bib-body">
<div class="csl-entry">Jim v. Bob, 444 U.S. 400 (1965)</div>
<div class="csl-entry">Smith v. Noakes, 222 U.S. 200, 333 L.Ed. 300 (1965)</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">Jim v. Bob, 444 U.S. 400 (1965)</div>
<div class="csl-entry">Smith v. Noakes, 222 U.S. 200 (1965)</div>
<div class="csl-entry">Smith v. Noakes, 333 L.Ed. 300 (1965)</div>
</div>]

[FAILURE] parallel HackedChicago
Expected:
[<div class="csl-bib-body">
<div class="csl-entry">People v. Taylor, 73 N.Y.2d 683, 541 N.E.2d 386, 543 N.Y.S.2d 357 (1989)</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">People v. Taylor, 73 N.Y.2d 683 (1989)</div>
<div class="csl-entry">People v. Taylor, 541 N.E.2d 386 (1989)</div>
<div class="csl-entry">People v. Taylor, 543 N.Y.S.2d 357 (1989)</div>
</div>]

[FAILURE] parallel JournalArticleReverse
Expected:
[Smith v. Noakes, 222 U.S. 200, 333 L.Ed. 300 (1965); Jim v. Bob, 444 U.S. 400 (1965)]

Got:
[Smith v. Noakes, 222 U.S. 200 (1965); Smith v. Noakes, 333 L.Ed. 300 (1965); Jim v. Bob, 444 U.S. 400 (1965)]

[FAILURE] parallel JournalArticleSimple
Expected:
[Jim v. Bob, 444 U.S. 400 (1965); Smith v. Noakes, 222 U.S. 200, 333 L.Ed. 300 (1965)]

Got:
[Jim v. Bob, 444 U.S. 400 (1965); Smith v. Noakes, 222 U.S. 200 (1965); Smith v. Noakes, 333 L.Ed. 300 (1965)]

[FAILURE] parallel TrailingIbid
Expected:
[Smith v. Noakes, 222 U.S. 200, 201, 333 L.Ed. 300, 301 (1965); ibid., 333 L.Ed. 301; Jim v. Bob, 444 U.S. 400, 401 (1966).]

Got:
[Smith v. Noakes, 222 U.S. 200, 201 (1965); Smith v. Noakes, 333 L.Ed. 300, 301 (1965); Smith v. Noakes, 222 U.S. 201; Smith v. Noakes, 333 L.Ed. 301; Jim v. Bob, 444 U.S. 400, 401 (1966).]

UNSORT:

We fail these. They allow for the whole bibliography and for individual citation groups to have a fixed user-defined order.

[FAILURE] unsort BibliographyNosortOption
Expected:
[<div class="csl-bib-body">
<div class="csl-entry">Roe</div>
<div class="csl-entry">Doe</div>
</div>]

Got:
[<div class="csl-bib-body">
<div class="csl-entry">Doe</div>
<div class="csl-entry">Roe</div>
</div>]

[FAILURE] unsort Citation
Expected:
[Roe; Doe]

Got:
[Doe; Roe]

There really isn't a great use-case for this. Recommendation: ignore.
