SRU Implementors Group Meeting/Integration Workshop
MEETING REPORT, March 1-2, 2006
Released March 29, 2006
AGENDAS: Implementors Group Meeting - Integration Workshop
LINKS: Workshop Presentations - Version 1.2 Changes
CQL Modifiers
There will not be a change to CQL to add index modifiers. (Nor will there
be a change to add term modifiers. Term modifiers were not part of the
proposal but were discussed in some detail, and the idea was eventually
abandoned.) Thus all modifiers that would apply to the index, relation,
or term will be carried as relation modifiers. (Boolean modifiers
will still be boolean modifiers.)
MODS context set
We will develop a bibliographic context set, whose name will be "bibliographic"
(short name 'bib'), not "mods". It will (as in the proposal)
be based on MODS semantics.
The bibliographic set will incorporate all that's useful from the Bath
context set (and Bath will be deprecated).
The MODS context set proposal can be used as a basis for this bibliographic
set, but it needs significant work. A working group will be assigned
to develop this set.
Tentative decisions (with regard to the original proposal, and given
that there will be no index modifiers)
- Flatten "type"; for example:
title/type=abbreviated would instead be titleAbbreviated;
title/type=uniform would instead be titleUniform;
etc.
- authority will become a relation modifier.
- part: will either be flattened (as above) or become a relation modifier.
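As an illustration of the flattening described above, here is a minimal Python sketch. The function name flatten_index is hypothetical, and the sketch assumes that only the "type" modifier is flattened and that the flattened name is formed by capitalizing the first letter of the value, as in the two examples given.

```python
def flatten_index(index: str) -> str:
    """Turn 'title/type=abbreviated' into 'titleAbbreviated'.

    Only a 'type' modifier is flattened; any other input is returned unchanged.
    """
    base, slash, modifier = index.partition("/")
    if not slash:
        return index  # no modifier present
    name, equals, value = modifier.partition("=")
    if name != "type" or not equals or not value:
        return index  # not a type modifier, leave as-is
    # Append the value to the index name with its first letter capitalized.
    return base + value[0].upper() + value[1:]


print(flatten_index("title/type=abbreviated"))
print(flatten_index("title/type=uniform"))
```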
MARC Context set
Change the following:
- marc tag
- ddd --> aaa (i.e. alphanumeric rather than digits)
- "up to" three characters, not "three" (fixed)
- indicator
- 0-9 (not just 1 or 2)
- subfield: any character (not necessarily alphanumeric)
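The revised constraints could be checked along the following lines. This is only a sketch under stated assumptions: the function names are hypothetical, "alphanumeric" is taken to mean ASCII letters and digits, the indicator is read as a single digit 0-9, and a subfield code is assumed to be exactly one character.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits, one to three of them.
TAG_RE = re.compile(r"[A-Za-z0-9]{1,3}")
# Assumption: an indicator is written as a single digit from 0 to 9.
INDICATOR_RE = re.compile(r"[0-9]")


def is_valid_tag(tag: str) -> bool:
    """True if the tag has up to three alphanumeric characters."""
    return TAG_RE.fullmatch(tag) is not None


def is_valid_indicator(indicator: str) -> bool:
    """True if the indicator is a single digit 0-9."""
    return INDICATOR_RE.fullmatch(indicator) is not None


def is_valid_subfield(code: str) -> bool:
    """True if the subfield code is any single character."""
    return len(code) == 1
```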
There will not be an OpenURL context set. Instead, there will be an OpenURL
profile.
The profile will prescribe a mapping from bibliographic indexes to OpenURL
keys. This will be a complex task and hopefully will be taken on by the
bib group.
The premise behind the context set proposal had been that a resolver,
upon receiving an OpenURL request, might want to search via SRU as part
of the resolution process. The theory was that the resolver could take
the keys from the received request and turn them directly into search
indexes, which would make the task of creating the search much simpler.
However, this would not be useful unless the server understands and supports
those indexes. Since we are planning to develop bibliographic indexes
(the bib set) and we want servers to support them, it would add too much
complexity and cause confusion to ask that servers also support the
suggested OpenURL set.
So instead, the resolver should map the keys to bibliographic indexes,
and the profile will specify that mapping. This will make the process
somewhat more difficult for the resolver than if it could simply use the
keys as indexes. However, the process will be simpler than it is today,
because a mapping will be available, which today does not exist.
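To make the resolver-side mapping concrete, here is a small Python sketch. It is illustrative only: the profile and the bibliographic set do not exist yet, so the index names (bib.title, bib.name, bib.isbn) and the function name openurl_to_cql are hypothetical placeholders, and joining the clauses with "and" is an assumption.

```python
# Hypothetical mapping from OpenURL keys to bibliographic indexes.
# The index names are placeholders; the real mapping is for the profile to define.
KEY_TO_INDEX = {
    "rft.btitle": "bib.title",
    "rft.aulast": "bib.name",
    "rft.isbn": "bib.isbn",
}


def openurl_to_cql(keys: dict) -> str:
    """Build a CQL query from the OpenURL keys that have a mapped index."""
    clauses = []
    for key, value in keys.items():
        index = KEY_TO_INDEX.get(key)
        if index is None:
            continue  # keys without a mapping are skipped
        # Escape backslashes and double quotes inside the quoted CQL term.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'{index} = "{escaped}"')
    return " and ".join(clauses)


print(openurl_to_cql({"rft.btitle": "Alice", "rft.aulast": "Carroll"}))
```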
The profile may also specify how an SRU response can facilitate the client
process of formulating an OpenURL. This corresponds to a scenario somewhat
the opposite of above. Above, a resolver receives an OpenURL and
wants to formulate an SRU request. In this case an SRU client receives
a record and wants to create an OpenURL (where the object described by
that record will be the referent). A client could use SRU to find an item
of interest, then request the record for that item in the appropriate
OpenURL schema -- for example: http://www.openurl.info/registry/docs/xsd/info:ofi/fmt:xml:xsd:book
for books, or http://www.openurl.info/registry/docs/xsd/info:ofi/fmt:xml:xsd:journal
for journals -- and use it to formulate an OpenURL request.
We will solicit advice from the OpenURL community in developing this
profile.
Proximity
Proximity units (other than in the cql set) should be treated such that
"unit" itself is a value in a context set, rather than the unit
value being a value in a context set. For example suppose you want
to define "street" as a proximity unit, within context set 'xyz'.
Do it like this: prox/xyz.unit="street"
rather than this: prox/unit=xyz.street
Proximity units 'word', 'sentence', 'paragraph', which are included in
the cql set, will be explicitly undefined.
All other proximity issues are deferred until version 2.0.
Sort
The sort proposal was accepted, with the provision that there needs to be
additional prose to describe case insensitivity better.
SRU via POST
The SRU via POST proposal is accepted. It will be referred to as "SRU
Post".
See http://www.loc.gov/standards/sru/sru-post.html
Open Search
Advantages of SRU vs. OS:
- cql
- schemas
- scan
- diagnostics
- stability
The strategy we discussed is to make OpenSearch requests legitimate SRU
requests. Then an SRU-friendly OS server will be able to do something
intelligent when it gets an SRU-loaded OS request.
For example, say the following three parameters were to occur in
an OS request
- query="alice lewis"
- x-os-title=alice
- x-os-creator=lewis
The latter two are (syntactically) legitimate SRU parameters where "x-"
indicates an extension. An SRU-friendly OS server might combine these
into a CQL query. An ordinary OS server will ignore them (because
it ignores whatever it doesn't understand) and will just process the two
search terms.
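A sketch of how an SRU-friendly OS server might combine these parameters into a CQL query follows. The mapping of x-os-title and x-os-creator to dc.title and dc.creator, the use of "and" to join clauses, and the function names are all assumptions made for illustration; nothing here was agreed at the meeting.

```python
from urllib.parse import parse_qs

# Assumed mapping from x-os-* extension parameters to CQL indexes.
X_OS_TO_INDEX = {"x-os-title": "dc.title", "x-os-creator": "dc.creator"}


def _escape(term: str) -> str:
    """Escape backslashes and double quotes for use in a quoted CQL term."""
    return term.replace("\\", "\\\\").replace('"', '\\"')


def cql_from_request(query_string: str) -> str:
    """Prefer the fielded x-os-* parameters; fall back to the plain query."""
    params = parse_qs(query_string)
    clauses = []
    for name, index in X_OS_TO_INDEX.items():
        for value in params.get(name, []):
            clauses.append(f'{index} = "{_escape(value)}"')
    if clauses:
        return " and ".join(clauses)
    # No extension parameters: behave like an ordinary OS server.
    return params.get("query", [""])[0]


print(cql_from_request("query=alice+lewis&x-os-title=alice&x-os-creator=lewis"))
```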
With OpenSearch 1.1 we note that SRU's 'startRecord' is the same as OS
'startIndex' and SRU's 'maximumRecords' is the same as OS 'count'. Further,
with appropriate namespace declarations, an OS declaration:
<url type="application/rss+xml"
template="http://example.com/?query={searchTerms}
&startRecord={startIndex}&
maximumRecords={count}&x-format=rss"/>
will correctly describe a valid SRU query. The contents of {searchTerms}
would have to be a valid CQL query, which would require at least simple
parsing of {searchTerms}.
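For illustration, a client could expand such a template roughly as follows. This is a minimal Python sketch, not a full OpenSearch template processor: the expand function is hypothetical, and it assumes that every placeholder value is percent-encoded and that missing values are replaced with an empty string.

```python
import re
from urllib.parse import quote

# The template from the declaration above, written on one line.
TEMPLATE = (
    "http://example.com/?query={searchTerms}"
    "&startRecord={startIndex}&maximumRecords={count}&x-format=rss"
)


def expand(template: str, values: dict) -> str:
    """Replace {name} and optional {name?} placeholders with encoded values."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Missing values become empty strings (an assumption of this sketch).
        return quote(str(values.get(name, "")), safe="")

    return re.sub(r"\{([^{}?]+)\??\}", substitute, template)


url = expand(TEMPLATE, {"searchTerms": 'dc.title = "alice"', "startIndex": 1, "count": 10})
print(url)
```

Here {searchTerms} carries a CQL query, so the resulting URL is also a syntactically valid SRU request.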
If we want to avoid CQL parsing, a valid OS declaration would be:
<url type="application/rss+xml"
template="http://example.com/?query={searchTerms}
&x-os-title={dc:title?}&x-os-creator={dc:creator?}
&startRecord={startIndex}&
maximumRecords={count}&x-format=sru"/>
OS 1.0 (single-field) servers could probably use the x- workaround, but
perhaps everybody would be better off if they upgraded to the OS 1.1 (multi-field)
standard. Standard guidance for mappings between, for example, x-os-creator
and {dc:creator} would be good. It is likely possible to auto-generate
a reasonable SRU Explain record from an OS description.
In any implementation merging SRU and OS, the two main issues are CQL
parsing and response format.
We can provide generic solutions for simple parsing of CQL, or use the
x- workaround to avoid this entirely.
However, response format is different between the two standards. OS uses
lightly modified RSS, whereas SRU uses a variety of schemas inside a
wrapper, one of which might be RSS. Implementors of OS servers would have to go through their
code and switch on x-format to create the relevant wrappers and namespaced
fields. This is likely not a very burdensome overhead for someone who
has already implemented OS.
OAI Profile
An OAI over SRU profile will be defined. It will specify that a server
support the following three indexes:
- rec.identifier
- rec.lastmodificationDate
- CollectionIdentifier
An extension will be defined so that "extra data" may be returned
-- the following three elements (corresponding to the above three):
- oai:identifier
- oai:datestamp
- oai:setSpec
Copyright/License
There is currently no copyright or license indication in the spec.
We will investigate putting a liberal copyright, unlimited reproduction,
with a Creative Commons type license. There must be sufficient ownership,
however, to prevent someone from claiming that a modified version is "SRU
2.1" for example. More discussion is needed on this.
XPath
XPath will be relegated to an extension, and it will become optional.
Record-Id
The record-id proposal, to add a recordIdentifier element as an optional
field to the record structure, was approved. Semantics:
"This element contains a persistent, opaque, unique identifier for
this record within this database, which can be subsequently used to retrieve
the same record using a search on the 'rec.identifier' index. The identifier
is not required to be globally unique, and nothing may be assumed about
its structure."
Base URL in Response
The base URL will be included (optional, but strongly recommended) in
the SRU response, within the echoed query (at the end).
Record Hits
The following search information will be incorporated (as optional elements)
into XCQL, for each subquery:
- hits (number of records that matched the subquery)
- term hits, for each term (number of occurrences of the term)
- diagnostic(s)
- recommended subquery
By "subquery", we mean that this information could apply to
any search clause, simple or complex. For example, for:
(A and B) and C
The following are subqueries:
- A
- B
- (A and B)
- C
- (A and B) and C
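The enumeration above corresponds to a post-order walk of the query tree. The following Python sketch shows one way to produce it. The Term and Boolean classes are a hypothetical stand-in for a parsed query, not the XCQL structure itself.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Term:
    value: str


@dataclass
class Boolean:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Term, Boolean]


def render(node: Node, top: bool = True) -> str:
    """Render a node as text, parenthesizing nested boolean clauses."""
    if isinstance(node, Term):
        return node.value
    text = f"{render(node.left, False)} {node.op} {render(node.right, False)}"
    return text if top else f"({text})"


def subqueries(node: Node) -> List[Node]:
    """List every search clause, simple or complex, in post-order."""
    if isinstance(node, Term):
        return [node]
    return subqueries(node.left) + subqueries(node.right) + [node]


query = Boolean("and", Boolean("and", Term("A"), Term("B")), Term("C"))
for clause in subqueries(query):
    print(render(clause))
```

Each clause returned by subqueries is a point where hits, term hits, diagnostics, or a recommended subquery could be attached.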
Z39.92
Z39.92 will replace the current Explain specification in the next SRU
version (1.2 or 2.0).
CQL Name Change
Participants agreed to a suggestion to change the name of CQL from "Common
Query Language" to "Contextual Query Language".
"Don't Care about Record Count"
An extension will be defined to allow the client to indicate that it
does not care whether or not the server includes the parameter numberOfRecords
in the response. (This will mean making the parameter optional.)
The reason for this is the concern that in some environments, counting
the records accurately is expensive.
"Number of Records Approximate"
A diagnostic will be defined to indicate that the number of records indicated
by the numberOfRecords parameter is approximate.
Implementation Id Superseded
An earlier decision (June 2005) to add an implementationId parameter
has been superseded by the addition of the base URL (described above),
so that parameter is no longer necessary.
Standardization
The basic standardization plan presented, to take SRU to OASIS, was
approved in principle. Included along with SRU would be:
- CQL
- Scan
- the Explain Operation (but not the Explain spec itself)
- mappings
Mappings would be:
- SRU (i.e. URL via HTTP GET)
- SRU over SOAP (i.e. SRW)
- SRU Post
Thus SRW would be renamed "SRU over SOAP".
Note that SRU Record Update would not be part of this process.
There is a suggestion that after the OASIS process, we should fast track
in NISO, and after that, fast track in ISO.
We need three OASIS members to initiate the process. Oxford is one. LC
is currently trying to join. Another possibility is University of
Manchester.
Next Version
The OASIS process will produce SRU 2.0. In the interim, version 1.2 may
be released, and it will be the input to the process. (If it is decided
not to release a version 1.2, then 1.1 will be the input.)
See List of Changes for Version 1.2.