sru url issues

Bruce D'Arcus-3
Not sure if this is my bug, or Saxon's.  Any ideas?

  FODC0005: Invalid URI {http://polaris.ipoe.uni-kiel.d...} - base
{file:/Users/darcusb/Projects/c...}
Error on line 115 of file:/Users/darcusb/Projects/citeproc/xsl/citeproc.xsl:
  FODC0005: Failed to load document
  http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=bib.citekey%20any%20"Ackley1990SnowCover"&operation=searchRetrieve&recordSchema=mods&recordPacking=xml&startRecord=1&maximumRecords=9999&x-info-2-auth1.0-authenticationToken=email=[hidden email]

Bruce



Re: sru url issues

Matthias Steffens
On 13-Jun-2005 at 16:19 -0400 Bruce D'Arcus wrote:

> Not sure if this is my bug, or Saxon's.  Any ideas?
>
>   FODC0005: Invalid URI {http://polaris.ipoe.uni-kiel.d...} - base
> {file:/Users/darcusb/Projects/c...}
> Error on line 115 of file:/Users/darcusb/Projects/citeproc/xsl/citeproc.xsl:
>   FODC0005: Failed to load document
>   http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=bib.citekey%20any%20"Ackley1990SnowCover"&operation=searchRetrieve&recordSchema=mods&recordPacking=xml&startRecord=1&maximumRecords=9999&x-info-2-auth1.0-authenticationToken=email=[hidden email]

Your URL works well for me if I paste it into a browser.

Have you tried escaping the quotes (" -> %22)?

Does the error also occur if you try a very simple (but still valid)
query like this one:

  http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=1

Matthias
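
To illustrate the escaping suggested above, here is a minimal Python sketch that builds the request URL with the standard library, so that the double quotes in the CQL query come out as `%22` and the spaces as `%20`. The parameter names and values are copied from the URL in this thread; the helper name `build_sru_url` is hypothetical, and the authentication token parameter is left out because its value is hidden in the archive.

```python
from urllib.parse import quote, urlencode

BASE = "http://polaris.ipoe.uni-kiel.de/refs/sru.php"


def build_sru_url(cql_query, base=BASE):
    """Return an SRU searchRetrieve URL with a percent-encoded CQL query.

    Hypothetical helper, for illustration only.
    """
    params = [
        ("version", "1.1"),
        ("query", cql_query),
        ("operation", "searchRetrieve"),
        ("recordSchema", "mods"),
        ("recordPacking", "xml"),
        ("startRecord", "1"),
        ("maximumRecords", "9999"),
    ]
    # quote_via=quote encodes spaces as %20 (rather than "+") and '"' as %22.
    return base + "?" + urlencode(params, quote_via=quote)


url = build_sru_url('bib.citekey any "Ackley1990SnowCover"')
print(url)
```

The resulting string contains no literal quote characters, which is what a strict URI consumer needs; dots such as the one in `version=1.1` are left alone because they are unreserved characters.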



Re: sru url issues

Bruce D'Arcus-3
On Jun 13, 2005, at 6:18 PM, Matthias Steffens wrote:

> http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=1

No.  Try it with wget.  It seems there's something weird with
redirecting??  Browsers handle it, but not wget or saxon.

Bruce




Re: sru url issues

Matthias Steffens
On 13-Jun-2005 at 18:30 -0400 Bruce D'Arcus wrote:

> On Jun 13, 2005, at 6:18 PM, Matthias Steffens wrote:
>
> > http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=1
>
> No.  Try it with wget.  It seems there's something weird with
> redirecting??  Browsers handle it, but not wget or saxon.

Could the dot in "1.1" again be the culprit, such that the server
only sees "http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1"?

Have you tried escaping the dots?

Matthias



Re: sru url issues

Matthias Steffens
In reply to this post by Bruce D'Arcus-3
On 13-Jun-2005 at 18:30 -0400 Bruce D'Arcus wrote:

> On Jun 13, 2005, at 6:18 PM, Matthias Steffens wrote:
>
> > http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=1
>
> No.  Try it with wget.  It seems there's something weird with
> redirecting??  Browsers handle it, but not wget or saxon.

Using GNU Wget 1.9.1 this works for me:

  wget -O srutest.xml "http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=1"

If I don't specify an output file name explicitly I do get "Can't
write to file ... (file name too long)" errors, though. Output to
standard output also works for me:

  wget -O - "http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=1"

Matthias



Re: sru url issues

Bruce D'Arcus-3
On Jun 13, 2005, at 7:08 PM, Matthias Steffens wrote:

> Using GNU Wget 1.9.1 this works for me:

The problem is the quote, which apparently is not valid in URIs, and
Saxon is pretty strict about that.  For some reason, I'm not getting
output results though.  Must explore some more.

BTW, would be nice if you output in utf-8.

Bruce




Re: sru url issues

Bruce D'Arcus-3
In reply to this post by Matthias Steffens
Try this:

  wget -O - "http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=bib.citekey%20any%20%3DAckley1990SnowCover%3D&operation=searchRetrieve&recordSchema=mods&recordPacking=xml&startRecord=1&maximumRecords=9999&x-info-2-auth1.0-authenticationToken=email=[hidden email]"

Bruce




Re: sru url issues

Matthias Steffens
In reply to this post by Bruce D'Arcus-3
On 13-Jun-2005 at 20:43 -0400 Bruce D'Arcus wrote:

> BTW, would be nice if you output in utf-8.

The database admin can choose whether he wants to store data in
utf-8 or in latin1. If it's utf-8, then all output will be utf-8,
otherwise it will be latin1. So output is currently based on the
admin's choice and we don't attempt to convert data on the fly right
now.

AFAIK, the XML specification requires that all parsers support both
the UTF-8 encoding (by default), and the ISO-8859-1 character set. So
would this pose any problems to you if you receive latin1 encoded
data?

Thanks, Matthias
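
As a rough illustration of the encoding question, the Python sketch below parses a latin1-encoded document that declares its encoding and then re-serializes it as UTF-8. Strictly speaking, the XML specification only obliges processors to read UTF-8 and UTF-16; ISO-8859-1 is widely supported in practice, provided the document declares it. The sample document is invented, and whether a particular XSLT processor accepts latin1 input depends on the parser it uses, which this sketch does not test.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document, encoded as latin1 with a matching
# encoding declaration so the parser knows how to decode the bytes.
latin1_bytes = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><title>Größe</title>'
).encode("iso-8859-1")

# The parser reads the declaration and decodes the content to Unicode.
root = ET.fromstring(latin1_bytes)
title = root.text

# Re-serialize the same element as UTF-8 bytes.
utf8_bytes = ET.tostring(root, encoding="utf-8")

print(title)
```

Once the document is parsed, the original byte encoding no longer matters to the consumer; converting on the fly is then just a question of which encoding is chosen at serialization time.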



Re: sru url issues

Matthias Steffens
In reply to this post by Bruce D'Arcus-3
On 13-Jun-2005 at 20:46 -0400 Bruce D'Arcus wrote:

> Try this:
>
> wget -O - "http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=bib.citekey%20any%20%3DAckley1990SnowCover%3D&operation=searchRetrieve&recordSchema=mods&recordPacking=xml&startRecord=1&maximumRecords=9999&x-info-2-auth1.0-authenticationToken=email=[hidden email]"

How is this supposed to work? Maybe your URL got garbled? Your email
came through with the cite key enclosed by '%3D', which is the
equals sign:

  bib.citekey any =Ackley1990SnowCover=

This isn't valid CQL, is it? Using '%22' would have worked, though:

  wget -O - "http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=bib.citekey%20any%20%22Ackley1990SnowCover%22&operation=searchRetrieve&recordSchema=mods&recordPacking=xml&startRecord=1&maximumRecords=9999&x-info-2-auth1.0-authenticationToken=email=[hidden email]"

At least it works here. In fact, the originally given query also works
for me, using wget, lynx or whatever browser:

  wget -O - "http://polaris.ipoe.uni-kiel.de/refs/sru.php?version=1.1&query=bib.citekey%20any%20%22Ackley1990SnowCover%20Broecker1997Science%20Granskog2004Baltic%20Mock2002Hydrobiologia%20Simstich2003MarMicropaleontol%20Thomas2003SeaIce%22&x-info-2-auth1.0-authenticationToken=email=[hidden email]"

Matthias
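
The difference between the two escapes discussed in this message can be checked by decoding both query strings. This is a small Python sketch using only the standard library; the two strings are the `query` values taken from the URLs above.

```python
from urllib.parse import unquote

# Query value as it arrived, with the cite key wrapped in %3D.
garbled = "bib.citekey%20any%20%3DAckley1990SnowCover%3D"
# Query value with the quotes escaped as %22.
intended = "bib.citekey%20any%20%22Ackley1990SnowCover%22"

print(unquote(garbled))   # bib.citekey any =Ackley1990SnowCover=
print(unquote(intended))  # bib.citekey any "Ackley1990SnowCover"
```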



Re: sru url issues

Bruce D'Arcus-3
In reply to this post by Bruce D'Arcus-3
On 6/14/05, Matthias Steffens <[hidden email]> wrote:

> This isn't valid CQL, is it? Using '%22' would have worked, though:

Oops, was using the wrong escape.  

Anyway, now ... it works!  Here's the output:

===========

CiteProc Test

Introduction

A citation with page number detail: (Broecker 1997, 23–24).

References

Broecker, WS 1997. Thermohaline Circulation, the achilles heel of our
climate system: Will man-made CO[sub:2] upset the current balance?,
Science278: 1582–88available from: Andrea Lorenz
([hidden email]), .

============

BTW, aside from some punctuation problems I need to look at (probably
in the CSL file) this points out the problem with our inconsistent use
of the location element.  I suggest maybe it'd be better for you to
use recordInfo for these data?  If you read the MODS docs, they
define location as:

"location" identifies the institution, repository holding the resource
or a remote location in form of a URL from which it is available.

The key part is the last bit, and is meant to suggest either online
resources (think of a newspaper article with a url) or physical
locations of unpublished manuscripts.

Question: do you want me to include these two simple refbase xslt and
xml files in the distribution (as examples), or not?

Bruce

Re: sru url issues

Matthias Steffens
On 14-Jun-2005 at 7:25 -0400 Bruce D'Arcus wrote:

> Anyway, now ... it works!  Here's the output:

Cool! That's good news.

> ===========
>
> CiteProc Test
>
> Introduction
>
> A citation with page number detail: (Broecker 1997, 23–24).
>
> References
>
> Broecker, WS 1997. Thermohaline Circulation, the achilles heel of our
> climate system: Will man-made CO[sub:2] upset the current balance?,
> Science278: 1582–88available from: Andrea Lorenz
> ([hidden email]), .
>
> ============
>
> BTW, aside from some punctuation problems I need to look at (probably
> in the CSL file) this points out the problem with our inconsistent use
> of the location element.  I suggest maybe it'd be better for you to
> use recordInfo for these data?  If you read the MODS docs, they
> define location as:
>
> "location" identifies the institution, repository holding the resource
> or a remote location in form of a URL from which it is available.

That's exactly how we use it, IMHO. In our case, a (possibly remote)
person with their own collection of physically available papers
represents a repository holding the resource. Anyone interested in
getting a copy can then contact this person (e.g. via email). I don't see
how we abuse the meaning of the location element.

> The key part is the last bit, and is meant to suggest either online
> resources (think of a newspaper article with a url) or physical
> locations of unpublished manuscripts.

The text you quoted seems definitely more generic to me. Of course,
if your interpretation is true, we should change it (and I'll happily
do so). But I'd like to have this confirmed first.

From the explanation of elements for recordInfo at the MODS site I
don't see an element where our "physically-available-from-person"
information would fit.

> Question: do you want me to include these two simple refbase xslt and
> xml files in the distribution (as examples), or not?

Yes, I'd welcome this!

Matthias



Re: sru url issues

Bruce D'Arcus-3
On Jun 14, 2005, at 8:19 AM, Matthias Steffens wrote:

> That's exactly how we use it, IMHO. In our case, a (possibly remote)
> person with their own collection of physically available papers
> represents a repository holding the resource. Anyone interested in
> getting a copy can then contact this person (e.g. via email).

OK.  So in that case, it would be appropriate to have the "available
from John Doe <[hidden email]>" in the citation.

I was thinking you were using it to store information about the user's
copy or something.

Bruce




Re: sru url issues

Matthias Steffens
On 14-Jun-2005 at 10:53 -0400 Bruce D'Arcus wrote:

> On Jun 14, 2005, at 8:19 AM, Matthias Steffens wrote:
>
> > That's exactly how we use it, IMHO. In our case, a (possibly
> > remote) person with their own collection of physically available
> > papers represents a repository holding the resource. Anyone
> > interested in getting a copy can then contact this person (e.g. via
> > email).
>
> OK.  So in that case, it would be appropriate to have the
> "available from John Doe <[hidden email]>" in the citation.
>
> I was thinking you were using it to store information about the
> user's copy or something.

Maybe we (or at least I) still have a misunderstanding here, since I
don't really see any difference between your last sentence and my
statement above. The "physically available paper" is indeed only a
copy of the article, not the original (say, in a public library).

But the important point is, that in the scope of the database (quite
often representing an inventory about all articles being physically
available within one institution) such a copy *is* the physical
location from where yet another copy can be retrieved. It is very
important to know where one can get the particular paper.

However, when citing records, people wouldn't want this information
to occur along with the citation. This is different from the case of
a newspaper article being available from http://...

I'm happy to put this information elsewhere if there's another place
that makes sense.

Thanks again, Matthias



Re: sru url issues

Bruce D'Arcus-3
On Jun 14, 2005, at 11:33 AM, Matthias Steffens wrote:

> Maybe we (or at least I) still have a misunderstanding here, since I
> don't really see any difference between your last sentence and my
> statement above. The "physically available paper" is indeed only a
> copy of the article, not the original (say, in a public library).
>
> But the important point is, that in the scope of the database (quite
> often representing an inventory about all articles being physically
> available within one institution) such a copy *is* the physical
> location from where yet another copy can be retrieved. It is very
> important to know where one can get the particular paper.

I think you rightly point out some ambiguity in this.  Do you feel like
posting a question about it to the MODS list?  People there might have
some ideas of how to handle this.

It might be worth noting that location does have an authority
attribute, whereby in theory you could do:

<location authority="refbase">...

... and citeproc could just not process locations with an authority
attribute.  That seems a little awkward perhaps.

Bruce
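
The filtering idea floated in this message could look roughly like the Python sketch below. citeproc itself is XSLT, so this is only an illustration of the logic, not its implementation. It takes the `authority` attribute on `location` at face value as described above; the sample record and the helper name `citable_locations` are invented and have not been checked against the MODS schema.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical record: one refbase-specific location and one plain one.
SAMPLE = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <location authority="refbase">
    <physicalLocation>Jane Doe (private reprint collection)</physicalLocation>
  </location>
  <location>
    <url>http://example.org/article</url>
  </location>
</mods>
"""


def citable_locations(mods_root):
    """Return only the location elements that carry no authority attribute."""
    return [
        loc
        for loc in mods_root.findall("mods:location", MODS_NS)
        if loc.get("authority") is None
    ]


root = ET.fromstring(SAMPLE)
for loc in citable_locations(root):
    # Only the plain location survives the filter.
    print(loc.findtext("mods:url", namespaces=MODS_NS))
```

With this approach the "available from" information stays in the record for database users, while the citation formatter simply skips it.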




Re: sru url issues

Matthias Steffens
On 14-Jun-2005 at 11:50 -0400 Bruce D'Arcus wrote:

> I think you rightly point out some ambiguity in this.  Do you feel
> like posting a question about it to the MODS list?  People there
> might have some ideas of how to handle this.

Yes, I can try, but I'm not sure if I'll be able to understand their
answers. Quite often this list is just too techy for me. :-/

> It might be worth noting that location does have an authority
> attribute, whereby in theory you could do:
>
> <location authority="refbase">...
>
> .... and citeproc could just not process locations with an
> authority attribute.  That seems a little awkward perhaps.

Well, it would be fine for me.

Matthias



Re: sru url issues

Bruce D'Arcus-3
On Jun 14, 2005, at 12:28 PM, Matthias Steffens wrote:

> Yes, I can try, but I'm not sure if I'll be able to understand their
> answers. Quite often this list is just too techy for me. :-/

I'd basically just explain the issue, and then ask "how should I code
this?"

If you prefer, I can do it.

Bruce




Re: sru url issues

Matthias Steffens
On 14-Jun-2005 at 13:55 -0400 Bruce D'Arcus wrote:

> On Jun 14, 2005, at 12:28 PM, Matthias Steffens wrote:
> > Yes, I can try, but I'm not sure if I'll be able to understand
> > their answers. Quite often this list is just too techy for me.
>
> I'd basically just explain the issue, and then ask "how should I
> code this?"

Ok, haven't found time to post a message yet.

> If you prefer, I can do it.

Well, I'd appreciate that! :-)

And please let me know if you encounter any other problems with the
refbase MODS XML output. Your testing has already helped us quite a
bit in fine-tuning our MODS output, which I appreciate!

Thanks, Matthias