SRU Implementors/Ed. Board Meeting
June 18, 2007 - Library of Congress
Meeting Report - June 27 
Attendees
- Rebecca Guenther, NISO
- Larry Dixson, LC
- Ed Summers, LC
- Dan Chudnov, LC
- Ryan Scherle, Indiana University
- Pat Case, CRS
- Ardie Bausenbach, LC
- Ray Denenberg, SRU Ed. Board
- Rob Sanderson, University of Liverpool
Topics
OASIS Process
The meeting began with a review of the OASIS process. See Charter.
The name of the Technical Committee is "OASIS Search Web
Services Technical Committee", abbreviated as "Search WS TC" and its
purpose is to define Search and Retrieval Web Services, combining current
and ongoing web service activities. The scope includes:
- Search/Retrieve
- Query
- Sorting
- Record Retrieval
- Index Browsing
One or more application profiles will be developed, not necessarily
(and most likely not) by the TC, but within the appropriate community,
for example, bibliographic, e-learning, geospatial. The work will involve
semantic description of search services but will build upon existing
work (e.g. NISO Z39.92) rather than define new descriptions, and will seek
input from abstract API initiatives such as OKI, ZOOM, and
SQI. However, the development or standardization of Abstract APIs is out
of scope. SRU and CQL will be used as the starting points. The
expected deliverables are service definitions, schemas, interface specifications
(POST, GET, SOAP), query language definition, and at least one community
defined profile.
The first meeting of the TC will be held by teleconference, July
18. Work will be carried out primarily by email and teleconference calls,
possibly every two weeks, with face-to-face meetings perhaps once or
twice a year. Following the first several calls, an initial face-to-face
meeting, perhaps two days, or one and a half, may be held. The TC will
determine its own schedule once organized.
There will be a TC listserv as well as an additional implementors listserv. The
TC listserv archive will be publicly accessible (only TC members can
post). The implementors listserv will be open to anyone (even non-OASIS
members). [Need to determine procedures to join the open list.]
See Joining the
OASIS Search Web Services Technical Committee.
CQL bibliographic searching
The August
7, 2006 draft (with minor revision June 11, 2007) was reviewed,
and the following changes were agreed upon.
- Remove bib.titleSub. (It is redundant. There is a sub modifier.)
- Add dc.creator to list of dc elements (currently contributor and
publisher) to search on rather than bib.name with a role modifier.
- Make marcrelator the default for bib.roleAuthority (MARC
Value List for Relators and Roles).
- Make 'w3cdtf' the default authority for bib date indexes.
- Add dc.date, for searching on non-specific dates.
- Combine Resource Type and Genre into Resource Type/Genre. The
indexes will be dc.type and bib.genre, with modifier bib.typeauthority.
- Make the default "server defined" for bib.languageAuthority.
Guidance provided by RFC 3066 is recommended.
OpenURL Profile
There are three possible use cases for an OpenURL profile.
OpenURL search points - index or mapping?
Use case. An OpenURL resolver, upon receiving an OpenURL
request, might want to search via SRU as part of the resolution process.
The resolver could take the keys from the received request, map the keys
to bibliographic indexes, and formulate an SRU request.
Background: Prior to the March 2006 meeting (more than a year ago)
a set of OpenURL indexes had been proposed. At the meeting, it was the
consensus that instead of defining explicit indexes, a mapping from the
desired search points to bib indexes would be preferable as it seemed
unlikely that the indexes would be implemented. However, in discussion
preceding the recent meeting (June 2007) it was suggested that the sample
mappings are too complicated and that simple indexes would be preferable,
thus in effect suggesting that the earlier decision (March 2006) be reversed.
The meeting participants seemed to feel that a mixed approach is best.
Some indexes need to be defined because the alternative mappings are
too complicated. On the other hand, some of the OpenURL search points
map well to bib indexes.
The next step is to determine what are the useful search points. An
initial set is listed in the document Searching
on OpenURL Keys.
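The resolver use case above can be sketched roughly as follows. This is only an illustration: the key-to-index mapping, the function names, and the example.org base URL are assumptions made for the sketch, since the useful search points have not yet been determined.

```python
from urllib.parse import urlencode

# Hypothetical mapping from OpenURL keys to CQL indexes. The actual search
# points had not been settled at the time of the meeting.
KEY_TO_INDEX = {
    "rft.btitle": "dc.title",
    "rft.au": "dc.creator",
    "rft.isbn": "dc.identifier",
}


def cql_term(value):
    """Quote a term for CQL, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_cql(openurl_keys):
    """AND together one clause per OpenURL key that has a mapping."""
    clauses = [
        f"{KEY_TO_INDEX[key]} = {cql_term(value)}"
        for key, value in openurl_keys.items()
        if key in KEY_TO_INDEX
    ]
    if not clauses:
        raise ValueError("no OpenURL key could be mapped to an index")
    return " and ".join(clauses)


def build_sru_request(base_url, openurl_keys):
    """Return a searchRetrieve URL for the mapped keys."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": build_cql(openurl_keys),
        "maximumRecords": "1",
    }
    return base_url + "?" + urlencode(params)
```

Keys without a mapping (for example rft.genre) are simply ignored here; a mixed approach would instead route them to explicitly defined indexes.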
SRU to OpenURL
Use Case: An SRU client receives a record and wants to create
an OpenURL, where the object described by that record will
be the referent. A client could use SRU to find an item of interest,
then request the record for that item in the appropriate OpenURL schema
-- for example: http://www.openurl.info/registry/docs/xsd/info:ofi/fmt:xml:xsd:book for
books, or http://www.openurl.info/registry/docs/xsd/info:ofi/fmt:xml:xsd:journal for
journals -- and use it to formulate an OpenURL request.
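A minimal sketch of this flow, under stated assumptions: the server is assumed to accept "info:ofi/fmt:xml:xsd:book" as a recordSchema value, the field names passed to openurl_from_book are assumed to have been read out of the returned record already, and both base URLs are placeholders.

```python
from urllib.parse import urlencode

# Assumption: the server identifies the OpenURL book schema by this URI.
BOOK_SCHEMA = "info:ofi/fmt:xml:xsd:book"


def sru_record_request(base_url, cql_query, position=1):
    """Ask for a single record in the OpenURL book schema."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,
        "startRecord": str(position),
        "maximumRecords": "1",
        "recordSchema": BOOK_SCHEMA,
    }
    return base_url + "?" + urlencode(params)


def openurl_from_book(resolver_url, book_fields):
    """Build a key/value OpenURL whose referent is the described book."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
    }
    for name, value in book_fields.items():
        # Each field of the record becomes a referent (rft.) key.
        params["rft." + name] = value
    return resolver_url + "?" + urlencode(params)
```

For journals the same shape would apply with the journal schema and the corresponding journal format identifier.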
SRU as an OpenURL Application
Use case: Rob and Dan will write this up.
OAI Profile for SRU
This part of the report to be rewritten (Rob)
The profile would be roughly based on the ideas presented in the Sanderson/Young/LeVan
2005 DLIB article SRW/U
with OAI: Expected and Unexpected Synergies, and the following summary
is based on that article.
- SRU Interfaces to OAI Aggregated Data:
allow the data harvested via OAI to be searched via SRU.
- OAI Interfaces to SRW Provided Data:
building OAI on top of SRW.
OAI has some requirements that SRU is not required to support, so these
features can be profiled:
- Three Indexes:
- oai.identifier: a unique identifier for each record in the database
- oai.datestamp: date/time the record was added or changed
in the database
- oai.set: browsable via the scan operation, to support selective
harvesting of records
- an extension to provide an extraRecordData element with
an oai:header fragment to include the identifier, datestamp, and
setSpec.
This would provide support for the following OAI functions:
- Identify: generated from selected parts of an SRU Explain
response.
- ListMetadataFormats: from schemaInfo of the Explain response.
- ListSets: from an SRU Scan of the oai.set index.
- ListRecords and ListIdentifiers: from an SRU Search/Retrieve
against the oai.datestamp and oai.set indexes.
- GetRecord: from an SRU Search/Retrieve against the oai.identifier
index.
- OAI Retrieval of SRW Discovered Data
In OAI, sets are an "optional construct for grouping items
for the purpose of selective harvesting", predefined, but left
up to the repository to design and describe. SRU has dynamic
sets, i.e. result sets. If a server had both SRU and OAI interfaces
to the same collection, a search could be performed in SRU creating
a set. Via extension metadata, information about the search could also
be sent at the same time, such as a suggested human readable name and
description. Once the result set has been created, it could be automatically
exposed in the OAI interface for retrieval.
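The mapping from OAI functions to SRU requests listed above might look roughly like this. It is a sketch only: the function names are hypothetical, the use of an empty scan term to list all sets is an assumption, and cql.allRecords is used as the fallback query when a harvest has no constraints.

```python
from urllib.parse import urlencode


def oai_to_sru_params(verb, identifier=None, set_spec=None,
                      from_date=None, until_date=None):
    """Translate an OAI verb and its arguments into SRU request parameters."""
    if verb in ("Identify", "ListMetadataFormats"):
        # Both are derived from parts of the Explain response.
        return {"operation": "explain", "version": "1.2"}
    if verb == "ListSets":
        # Assumption: scanning from the empty term lists every set.
        return {"operation": "scan", "version": "1.2",
                "scanClause": 'oai.set = ""'}
    if verb == "GetRecord":
        if identifier is None:
            raise ValueError("GetRecord needs an identifier")
        query = f'oai.identifier = "{identifier}"'
    elif verb in ("ListRecords", "ListIdentifiers"):
        clauses = []
        if set_spec is not None:
            clauses.append(f'oai.set = "{set_spec}"')
        if from_date is not None:
            clauses.append(f'oai.datestamp >= "{from_date}"')
        if until_date is not None:
            clauses.append(f'oai.datestamp <= "{until_date}"')
        # With no constraints, fall back to the index that matches all records.
        query = " and ".join(clauses) or "cql.allRecords = 1"
    else:
        raise ValueError(f"unsupported OAI verb: {verb}")
    return {"operation": "searchRetrieve", "version": "1.2", "query": query}


def oai_to_sru_url(base_url, verb, **arguments):
    """Build the full SRU URL for an OAI request."""
    return base_url + "?" + urlencode(oai_to_sru_params(verb, **arguments))
```

ListIdentifiers and ListRecords share a query here; the difference would be in what the gateway copies from each SRU record (the oai:header fragment alone, or the header plus the record).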
Holdings Schema
(This section was supplied by Janifer Gatenby)
The ISO Holdings Schema (ISO 20775) is designed to replace the Z39.50/SRU
holdings and OPAC schemas. This standard differs from many other holdings
standards in that it is primarily designed to be used in search responses
rather than for reporting purposes. As a response schema, it includes
relatively static and dynamic information in combination. The dynamic
information comprises availability and policy information that may differ
depending on the requester and also usage history. It covers all
holdings, physical and electronic. Another important feature
of the schema is that it includes a summary section for a group of "interchangeable
copies" that can be readily parsed and displayed indicating availability
and policy (e.g. terms of delivery). The summary is flexible enough
to cover multiple definitions of "interchangeability" depending
on user needs, e.g. multiple copies (physical and digital) of a book
or article, multiple copies of various different editions of a work,
and multiple copies of different works in a result set. The schema
includes detail about holdings and an optional section about the resource
or group of resources to which the holdings pertain. Thus the schema
may be used standalone or it may be used as a fragment of a larger schema,
e.g. MODS or ONIX. Two example scenarios:
- A query requests bibliographic and holdings detail be returned in
the response. The results are sent as MODS records with Holdings
schema embedded in each record to include holdings (the holdings schema
does not include a redundant resource section).
- A query requests bibliographic details which are returned in MODS. A
follow on query requests holdings detail for the records in the previous
result set. In this case the ISO Holdings Schema is used with
the bibliographic identifiers in the resource section.
The standard is long awaited and approval is expected by the end of
2007. The first attempt at a holdings schema that included item
availability was the Z39.50 OPAC schema. The Z39.50 holdings schema
was an attempt to supersede this OPAC schema but it was complicated,
little understood and very sparsely implemented as a consequence. In
2004, ISO formed a working group to create a holdings schema, overcoming
the limitations of the Z39.50 OPAC and Holdings Schemas. After
a slow start the group re-energized in 2006, and an XML version of the
schema is available for testing purposes at:
http://oclcpica.org/?id=1013&ln=uk.
Record metadata
A Specification for Requesting
Record metadata via SRU has been developed. As part of this
work, an XML namespace (draft) has been developed, and there is
a draft Namespace
Information Page. The 'rmd' schema needs to be developed.
Agreements
- The use of the expression "administrative metadata" will
be struck from the document. "Record metadata" will be used
instead.
- MODS elements from recordInfo will be added.
- Some of the elements from rec 1.1 are missing and will be added.
Limitation if Identifier and Result Set not Available
If a client wants to retrieve record metadata for a specific record,
and if it knows either the record identifier or the result set position,
it can request the record by identifier or result set position specifying
the rmd schema, or some other record metadata schema. (And if the
client has already retrieved the record, and if the record has an identifier,
then the client knows the identifier if SRU 1.2 is used, because the
identifier is now part of the record response structure.)
However, if the record does not have an identifier, and if the server
does not support result sets, then neither mechanism (record identification
via identifier or result set position) is available. In that case,
the only way to retrieve the record metadata is by explicitly requesting
that it accompany the record data. The client must request it via extraRequestData,
and the server supplies it via extraRecordData.
Add discussion of this limitation to the record metadata document.
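The choice among the three mechanisms can be summarized in a few lines. This is an illustrative sketch; the function name and the strategy labels are invented for the example and do not appear in the specification.

```python
def record_metadata_strategy(record_identifier=None, result_set_position=None,
                             server_has_result_sets=False):
    """Pick how a client can obtain record metadata for one record."""
    if record_identifier is not None:
        # Request the record by identifier, specifying a metadata schema.
        return "by-identifier"
    if server_has_result_sets and result_set_position is not None:
        # Request the record by its position in the result set.
        return "by-result-set-position"
    # Neither mechanism applies: the metadata must accompany the record,
    # requested via extraRequestData and supplied via extraRecordData.
    return "with-record"
```

The last branch is the limitation described above: without an identifier or a result set there is no later opportunity to ask for the metadata.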
Record Update
See June 8
update and Namespace
Information Page. This is awaiting completion of the schema
and wsdl bindings, and will be completed soon. Record Update
will not be part of the OASIS work, but could be part of the bibliographic
profile.
Bibliographic Profile for
SRU
The premise behind a bibliographic profile for SRU
is that work on the base protocol for SRU 2.0 will be done in OASIS,
and community profiles developed in appropriate communities. We hope
that a bibliographic profile will be taken on by NISO. It would
include:
- bib context set
- Related bibliographic mappings
- Semantics
- OpenURL
- mapping
- context set
- other scenarios
- OAI
- Holdings schema
- scenarios described above
- Record Metadata
- Record Update
Rebecca will investigate the possibility of NISO taking on this work.
The OASIS standardization and the profiling processes should proceed
in parallel with liaison activity. The profiling activity may result
in additional requirements for the protocol and these should be forwarded
to the OASIS TC.
XQuery and CQL
Discussion of accommodating XQuery within CQL, or vice versa, or profiling
XQuery, is deferred. If we discover XQuery functions, desired but
not supported by CQL, we should renew discussions.
Completing the 1.2 site
See draft
SRU 1.2. Following the meeting we hope to make this the official
SRU spec as soon as we can. (It will still need to undergo approval
at LC.) The URL for the draft site is http://www.loc.gov:8081/standards/sru/
and for the current (1.1) SRU, http://www.loc.gov/standards/sru/ so
the only difference is the :8081 port number (which just means it is
the test server at LC).
Several of the pages have not yet been written (for example
the introductions) and these will need to be written or the pointers
removed. Rob will write the introductions for SRW, CQL, and Zeerex.
Discussion of "Frequently Asked Questions" (FAQ). There
should be an "official" FAQ and an "unofficial" FAQ.
The official FAQ would be under control of the Ed. Board, maintained
by Rob. The unofficial FAQ would reside on the SRU wiki.
SRU Explain/zeeRex/Z39.92
Issues:
- The OASIS charter references Z39.92, and it is planned that the completed
standard will use Z39.92 for Explain. But Z39.92 was never finished. It
seems to have been abandoned by NISO along with the other metasearch
work. Rob and I will inquire about the status and if there is
any prospect for getting it finished.
- In fact, SRU 1.2 is supposed to reference Z39.92, according to What's
New in Version 1.2?. But there is nothing in the 1.2 spec to
reflect any such change. Rob's opinion is that Z39.92 is completely
compatible with SRU so no technical change is necessary (but he will
doublecheck) so we just need to make mention of Z39.92 somewhere.
- The terms "Explain" and "Zeerex" seem to be used
interchangeably but they have somewhat different meanings. Rob will
review all such terminology in the 1.2 spec.
OpenSearch, SRU and RSS
OpenSearch now has a website, openSearch.com, but it isn't clear who
is in charge of the spec since its founder, Dewitt Clinton, has gone
off to Google. It is still in the domain of A9.
Response Format
It was the consensus of the meeting that there should be a parameter
(in SRU version 2.0) to specify the requested response schema: SRU, RSS,
Atom, etc.
Integrating SRU and OpenSearch
One strategy is to make OpenSearch requests legitimate SRU requests.
Then an SRU-friendly OS server will be able to do something intelligent
when it gets an SRU-loaded OS request.
SRU's 'startRecord' is the same as OS 'startIndex' and SRU's 'maximumRecords'
is the same as OS 'count'. So an OS declaration:
<Url type="application/rss+xml"
template="http://example.com/?query={searchTerms}&amp;startRecord={startIndex}&amp;maximumRecords={count}"/>
will correctly describe a valid SRU query.
However, we need both:
- query={searchTerms} and
- query="{searchTerms}"
i.e. one with {searchTerms} quoted and one unquoted. So we need two
templates, the one above, and in addition:
<Url type="application/rss+xml"
template="http://example.com/?query=&quot;{searchTerms}&quot;&amp;startRecord={startIndex}&amp;maximumRecords={count}"/>
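As an illustration of how a client might use the two templates, the sketch below fills in the placeholders and picks the quoted form when the search terms contain whitespace. The selection rule and the function names are assumptions for the example, and the quoted template is written with %22 so that the result is a valid URL.

```python
import re
from urllib.parse import quote

# The two templates, as they read after XML unescaping.
UNQUOTED = ("http://example.com/?query={searchTerms}"
            "&startRecord={startIndex}&maximumRecords={count}")
QUOTED = ("http://example.com/?query=%22{searchTerms}%22"
          "&startRecord={startIndex}&maximumRecords={count}")


def fill_template(template, **values):
    """Replace each {name} placeholder with its percent-encoded value."""
    def substitute(match):
        return quote(str(values[match.group(1)]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)


def sru_url_for(search_terms, start_index=1, count=10):
    """Use the quoted template when the terms would not be a single CQL word."""
    template = QUOTED if re.search(r"\s", search_terms) else UNQUOTED
    return fill_template(template, searchTerms=search_terms,
                         startIndex=start_index, count=count)
```

With the quoted template, a multi-word entry is sent as one CQL phrase; with the unquoted one, the user's input is passed through as CQL.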
Proximity
All proximity issues, except the following, "proximity units",
are deferred for now and may be raised during the OASIS process.
Proximity Units
The March
2006 meeting report says:
Proximity units (other than in the cql set) should be treated
such that "unit" itself is a value in a context set, rather than the unit
value being a value in a context set.
For example suppose you want to define "street" as a proximity unit,
within context set 'xyz'. Do it like this:
prox/xyz.unit="street"
rather than this:
prox/unit=xyz.street
Proximity units 'word', 'sentence', 'paragraph', which are included in
the
cql set, will be explicitly undefined.
Nothing explicit got into the 1.2 spec to reflect this. Fortunately
there is nothing to prevent it either; this is supported by the BNF,
so adding prose to reflect it does not constitute a substantive change.
'prox/xyz.unit="street"' is preferable to 'prox/unit=xyz.street'
because whenever you have 'prox/unit=...', 'unit' is a modifier from
the cql context set, so its value would have to be one that is defined
in the cql context set. prox/unit=xyz.street pairs a modifier from
one set with a value from another, which is not a good practice.
One might suggest adding all the units that people come up with to the
cql set. But we don't want to do that. This is a way to support
the definition of units in other context sets.
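A small sketch of the convention, assuming a hypothetical helper that builds the prox operator: the context set prefix goes on the modifier name, never on the unit value. The function name is invented, and 'xyz' stands for any context set that defines its own units.

```python
# Units named in the cql context set (their meaning is left to the server).
CQL_UNITS = {"word", "sentence", "paragraph", "element"}


def prox_operator(unit, context_set=None):
    """Build a prox operator with a unit modifier from the right context set."""
    if "." in unit:
        # Rejects the discouraged form prox/unit=xyz.street.
        raise ValueError("qualify the modifier, not the value")
    if context_set is None:
        if unit not in CQL_UNITS:
            raise ValueError(f"{unit!r} is not a unit of the cql context set")
        return f"prox/unit={unit}"
    return f'prox/{context_set}.unit="{unit}"'
```

For example, prox_operator("street", "xyz") gives prox/xyz.unit="street", while prox_operator("street") is refused because 'street' is not a unit of the cql set.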
So we propose to add prose, in two places.
1. http://www.loc.gov:8081/standards/sru/specs/cql.html number
9, "Boolean
Modifiers":
Proximity units 'word', 'sentence', 'paragraph', and 'element'
are defined in the CQL context set, and may also be defined in other
context sets. Within the CQL set they are explicitly undefined. When
defined in another context set they may be assigned specific meaning.
Thus compare "prox/unit=word" with "prox/xyz.unit=word".
In the first, 'unit' is a prox modifier from the CQL set, and as such
its values are undefined, so 'word' is subject to interpretation by the
server. In the second, 'unit' is a prox modifier defined by the xyz context
set, which may assign the unit 'word' a specific meaning.
The context set xyz may define additional units, for example, 'street':
prox/xyz.unit="street"
Note that this approach, prox/xyz.unit="street", is preferable to 'prox/unit=xyz.street'.
In the first case, 'unit' is a modifier defined in the xyz context set,
and 'street' is a value defined for that modifier. In the second, 'unit'
is a modifier from the cql context set, with a value defined in a different
set, although its value would have to be one that is defined in the cql context
set. Pairing a modifier from one set with a value from another is not
a good practice.
2. http://www.loc.gov:8081/standards/sru/resources/cql-context-set-v1-2.html under "PROX"
Similar prose.
Action Items
- Ray
- revise CQL Bibliographic Searching document (done)
- determine procedures to join the open list
- apply changes to record metadata document (done)
- rmd schema
- add prox prose to 1.2 spec (done)
- record metadata changes
- Rob
- rewrite "OAI Profile for SRU" section of report.
- introductions for SRW, CQL, and Zeerex.
- Zeerex
- Doublecheck that Z39.92 is completely compatible with
SRU
- Review use of terms "Explain" and "Zeerex" in the 1.2 spec.
- Rebecca
- investigate the possibility of NISO taking on bib profile.
- Rob and Ray
- inquire about the status of Z39.92 and if there is any prospect
for getting it finished.
- Rob and Dan
- Write up use case for SRU as an OpenURL Application
- Matthew
- Schema and wsdl bindings, for record update.
- Unassigned
- Determine what are the useful search points for OpenURL