Discussion:
Persistent problem with pandoc-citeproc, but only when specifying a CSL citation-style file
Søren ONeill
2015-06-11 11:23:44 UTC
I have spent the last two days (while sick) looking at this very
annoying issue:

*The problem:*
Using RStudio to knit Rmd -> HTML (with embedded R code and
citations/bibliographies):

1. I am successful as long as I don't specify a csl file in the header.
2. If I do specify a csl file, I get the following error in the console:

pandoc-citeproc: error while reading the XML file: XMLParseError "syntax
error" (XMLParseLocation {xmlLineNumber = 1, xmlColumnNumber = 0,
xmlByteIndex = 0, xmlByteCount = 0})
pandoc: Error running filter /usr/lib/rstudio/bin/pandoc/pandoc-citeproc
Fejl: pandoc document conversion failed with error 83
Execution halted

*The platform:*
I have reproduced this issue on two separate Linux installations, including
a fresh Ubuntu 14.04 install, with a fresh install of RStudio and pandoc.
The BibTeX and csl files are both available in the same folder as the Rmd
file.

*The investigation:*
As the problem appears (to me) to be related to pandoc-citeproc, I have
tried to install newer versions than the one available in the Ubuntu
repositories (1.12):
I tried the latest available binary from the pandoc site -- did not help
I tried installing the Haskell platform itself and installing pandoc
(1.14.0.4) and pandoc-citeproc (0.7.2) via cabal -- did not help

I have also tried going through the motions manually instead of via RStudio:
knitting from Rmd to md via R works flawlessly, but converting the resulting
.md file with pandoc fails:
pandoc Test_manuscript.md -o Test_manuscript.docx --bibliography references.bib
pandoc-citeproc: error while parsing the XML string
pandoc: Error running filter pandoc-citeproc

*The suggestions:*
The closest I have come to a solution online is this link:
<https://github.com/jgm/pandoc-citeproc/issues/81>
I gather I need to 'pull in hexpat', whatever that means ...
Whether that is the correct solution or not I don't know, but it's
certainly not a very accessible solution to us non-Haskellians.

*The solution:*
...? ... this is where you come in :-)

*The appendix:*
My Rmd file:
---
title: "Test of R, Rmarkdown & BibTeX"
csl: bmcart.cls
output: html_document
bibliography: references.bib
---

This is a test of R, Rmarkdown and BibTeX in which I hope to use RStudio to
generate HTML, Word files and pdf files from an Rmarkdown script.

Particularly, I want the script to include text, embedded R code, citations
[@clar_clinical_2014] from a BibTeX catalogue and graphics.

Can it all be done **in a single file**, I wonder?

```{r}
summary(cars)
```

You can also embed plots, for example:

```{r, echo=FALSE}
plot(cars)
```

Note that the `echo = FALSE` parameter was added to the code chunk to
prevent printing of the R code that generated the plot.

#References

..after knitting with R:
---
title: "Test of R, Rmarkdown & BibTeX"
csl: bmcart.cls
output: html_document
bibliography: references.bib
---

This is a test of R, Rmarkdown and BibTeX in which I hope to use RStudio to
generate HTML, Word files and pdf files from an Rmarkdown script.

Particularly, I want the script to include text, embedded R code, citations
[@clar_clinical_2014] from a BibTeX catalogue and graphics.

Can it all be done **in a single file**, I wonder?


```r
summary(cars)
```

```
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
```

You can also embed plots, for example:

![plot of chunk unnamed-chunk-2](figure/unnamed-chunk-2-1.png)

Note that the `echo = FALSE` parameter was added to the code chunk to
prevent printing of the R code that generated the plot.

#References
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+***@googlegroups.com.
To post to this group, send email to pandoc-***@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/c75a5bce-7acc-438f-9d2e-32bf17d118e3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
n***@gmail.com
2015-06-11 13:00:41 UTC
bmcart.cls? C-L-S?? Is that a CSL (Citation Style Language) style file at
all? Could you try one from
https://github.com/citation-style-language/styles instead?
Søren ONeill
2015-06-11 15:38:49 UTC
Well spotted. bmcart.cls is indeed not an appropriate CSL file.

Unfortunately, it makes no difference ... same outcome with e.g. pain.csl
and european-journal-of-pain.csl
Søren ONeill
2015-06-11 15:50:28 UTC
...well not exactly the same actually ... the error message is different:

pandoc-citeproc: error while reading the XML file: XMLParseError "not
well-formed (invalid token)" (XMLParseLocation {xmlLineNumber = 32,
xmlColumnNumber = 70, xmlByteIndex = 2396, xmlByteCount = 0})
pandoc: Error running filter /usr/lib/rstudio/bin/pandoc/pandoc-citeproc
Fejl: pandoc document conversion failed with error 83
Execution halted

Is anyone else running the RStudio + pandoc combination on Ubuntu (or
derivatives)?
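
The XMLParseLocation in the message names a specific spot in the file being parsed. Below is a minimal Python sketch for looking at that spot. It assumes the numbers follow the usual expat convention (1-based line numbers, 0-based column numbers); `show_location` and its `width` parameter are made up for illustration and are not part of pandoc or pandoc-citeproc.

```python
def show_location(text, line_number, column_number, width=30):
    """Return the text around a reported parse location.

    Assumes a 1-based line number and a 0-based column number.
    """
    lines = text.splitlines()
    if not 1 <= line_number <= len(lines):
        raise ValueError("line number outside the file")
    line = lines[line_number - 1]
    # Clamp the left edge so the slice never wraps around to the end of the line.
    start = max(0, column_number - width)
    return line[start:column_number + width]
```

For the message above, this would be called as `show_location(text, 32, 70)` on the contents of the csl file. The column may be counted in bytes rather than characters, so treat the result as approximate.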
Joost Kremers
2015-06-11 16:23:19 UTC
The error message is telling you that there is a parse error in an XML
file. Although it's not clear from the message, the relevant file is
your csl file. (It's the only one that's in XML; that also explains why
you only get the error when you specify a csl file, because when you
leave it out, the csl file isn't read at all.)

So check your csl file; there must be an error in it somewhere.
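
One way to run such a check is to try parsing the file as XML and to look at its root element. Here is a minimal Python sketch using only the standard library. `check_csl` is a hypothetical helper, not part of pandoc or pandoc-citeproc. It assumes a valid style has a root `<style>` element in the CSL namespace `http://purl.org/net/xbiblio/csl`.

```python
import xml.etree.ElementTree as ET

CSL_NAMESPACE = "http://purl.org/net/xbiblio/csl"


def check_csl(text):
    """Return a short diagnosis for the contents of a supposed CSL file."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple reported by the parser.
        line, column = err.position
        return f"not well-formed XML (line {line}, column {column})"
    if root.tag != f"{{{CSL_NAMESPACE}}}style":
        return f"XML, but root element is {root.tag!r}, not a CSL <style>"
    return "looks like a CSL style"
```

A call such as `check_csl(open("pain.csl", "rb").read())` gives a quick answer before running pandoc. A LaTeX class file or a saved web page would normally be reported as not well-formed, or as having the wrong root element.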
--
Joost Kremers
Life has its moments
Søren ONeill
2015-06-11 15:56:11 UTC
And a further observation: in the original post, I mention that doing the
process manually rather than through RStudio also fails, even when not
specifying a csl on the command line. More precisely: if the .md file
contains a csl line in the header, or I specify a csl in the pandoc CLI
call, it fails. Only if there is no csl header line and no csl parameter in
the pandoc CLI call will it complete successfully.
Søren ONeill
2015-06-11 16:07:15 UTC
The files I use, if anyone wants to attempt to reproduce the error.
n***@gmail.com
2015-06-11 16:18:26 UTC
You seem to have saved the whole
https://github.com/citation-style-language/styles/blob/master/pain.csl
landing page.

Try
https://raw.githubusercontent.com/citation-style-language/styles/master/pain.csl
instead.
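
The two URLs above differ in a mechanical way: the host changes and the `blob` path segment is dropped. Here is a minimal Python sketch of that rewrite. `github_raw_url` is a hypothetical helper, and the rule is inferred from the two URLs in this message rather than from any GitHub documentation.

```python
def github_raw_url(page_url):
    """Turn a GitHub 'blob' page URL into the matching raw-file URL."""
    prefix = "https://github.com/"
    if not page_url.startswith(prefix):
        raise ValueError("not a github.com URL")
    parts = page_url[len(prefix):].split("/")
    # Expected layout: owner / repo / "blob" / branch / path...
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError("not a GitHub file page URL")
    owner, repo, _, *rest = parts
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])
```

Saving the result of the rewritten URL should give the plain XML style file instead of the HTML page that wraps it.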
Søren ONeill
2015-06-11 16:52:19 UTC
I am a little bit embarrassed now ... I will have to put it down to being
under the weather with sneezing, phlegm and fever for the last 2 days. How
stupid am I?

You're absolutely right: the first error was to use the wrong file (cls
instead of csl) and the second was to download the wrong file.

I have now managed (TA-DA !!) to download a simple text file from the
internet and put it in the right folder on my local drive ... and all is
well :-)

Thanks to you both, nickba and joost, for spotting what should have been
obvious to me -- sorry to have spent (I won't say wasted) your time.
Joost Kremers
2015-06-11 20:59:09 UTC
No worries. It's not like we've never been there...