SRU (Search/Retrieval Using URL)

SRU Record update

June 8, 2007


This is version 1.0 of SRU Record Update


Background

The Record Update service allows for remote maintenance and administration of records within a compliant database. It provides a simple and extensible mechanism for this: a single request/response pair that allows for creating, replacing and deleting records and metadata about those records.

The need for such a protocol has been expressed by several groups, but may benefit many. In particular, it is required for datasets which are maintained by distributed collaboration and contribution such as union catalogues, local history databases, book review databases and so on. Going further, it also allows many clients to be created for one service rather than a very tightly linked client/server relationship.

Although the protocol is being developed under the SRU 'umbrella', there is no need to implement SRU. It would be perfectly feasible to implement Record Update in order to maintain a database served only via OAI, or only via a proprietary HTML interface. To contrast OAI and record update, OAI is a pull mechanism to update databases and is used generally for scheduled batch processing, while record update is a push mechanism intended for more interactive use.

Most simple update mechanisms available on the web have been designed for document updating, on the assumption that a client posts a document to one or more databases or makes a document available for harvesting, and that the client (source) generally controls all maintenance of the document. In contrast, the process of updating metadata in a centralised database has characteristics that argue for a specific updating mechanism. These characteristics include changes that may occur to the metadata on entry to another database. The system of the target database will typically try to match the metadata with a record already on file and then merge the data, so an insert is actually changed into a merge. Often the system uses a merging profile that allows some incoming fields to be rejected if a similar field already exists and allows others to overwrite existing fields. Some fields will be changed on entry, for example by replacing authors and subjects with preferred forms from authority records. Some fields, such as provenance or classification, may be generated automatically. The receiving system will also typically validate incoming data, sometimes rejecting the whole record and sometimes just the invalid parts of it. These last three cases (authority control, enrichment and validation) can also apply to inserted records that remain unmatched. Because no insert command is taken at face value, there is a special requirement for keeping records in two separate systems aligned. The only sure way to guarantee that a record on one database is aligned, and stays aligned, with a record on another is to exchange record identifiers. In addition, these identifiers allow real-time access to the database for enriched content.

As a consequence, SRU update includes extensive diagnostics that are used to inform a submitting client of any changes that have been made on entry to another database and to convey the record identifier of the record on the target database. It is envisaged that SRU update will be primarily employed in an interactive situation where immediate response to diagnostics is possible. Alternatively, SRU update may run as a real-time background task, with a receiving program capable of processing the identifiers and diagnostics in update responses.


Operations

There are two operations in Record Update, the Update operation and the Explain operation. Update (below) contains a parameter which specifies the type of update action to perform (create, replace, or delete), and may be extended by profiles. Explain requests a service description record for the Update operation. The remainder of this document describes the Update operation.


Update Operation

Message Structure

 

Namespaces

Messages use elements from three different namespaces. WSDL and Schema files are available for them.

Prefix  Namespace URI                       Namespace Information       Description
srw     //www.loc.gov/zing/srw/                                         The SRW schema
ucp     info:lc/xmlns/update-v1             Namespace Information Page  Additional elements for Update
diag    //www.loc.gov/zing/srw/diagnostic/                              SRW Diagnostics schema

 

Request Structure

 

Name Type Required
srw:version xsd:string   Value="1.0" Mandatory
ucp:action xsd:string Mandatory
ucp:recordIdentifier xsd:string Optional
ucp:recordVersions sequence Optional
   ucp:recordVersion Mandatory
      ucp:versionType xsd:string Mandatory
      ucp:versionValue xsd:string Mandatory
srw:record Optional
   srw:recordPacking xsd:string Mandatory
   srw:recordSchema xsd:string Mandatory
   srw:recordData srw:stringOrXmlFragment Mandatory
   srw:extraRecordData srw:extraDataType Optional
srw:extraRequestData srw:extraDataType Optional
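
As an illustration only, the request structure above could be assembled with Python's standard xml.etree.ElementTree module, as sketched below. The root element name updateRequest, the "http:" scheme on the SRW namespace URI, and the record schema identifier urn:example:schema are assumptions made for this sketch; none of them is defined in this section.

```python
import xml.etree.ElementTree as ET

# Assumption: the SRW namespace URI is written here with an "http:" scheme.
SRW = "http://www.loc.gov/zing/srw/"
UCP = "info:lc/xmlns/update-v1"


def build_update_request(action, record_identifier=None, record_xml=None,
                         record_schema=None):
    """Build an update request element following the table above."""
    # Assumption: "updateRequest" is a hypothetical root element name.
    root = ET.Element(f"{{{UCP}}}updateRequest")
    ET.SubElement(root, f"{{{SRW}}}version").text = "1.0"    # mandatory
    ET.SubElement(root, f"{{{UCP}}}action").text = action    # mandatory
    if record_identifier is not None:                         # optional
        ET.SubElement(root, f"{{{UCP}}}recordIdentifier").text = record_identifier
    if record_xml is not None:                                # optional record
        record = ET.SubElement(root, f"{{{SRW}}}record")
        ET.SubElement(record, f"{{{SRW}}}recordPacking").text = "xml"
        ET.SubElement(record, f"{{{SRW}}}recordSchema").text = record_schema
        data = ET.SubElement(record, f"{{{SRW}}}recordData")
        data.append(ET.fromstring(record_xml))
    return root


request = build_update_request(
    "info:srw/action/1/create",
    record_xml="<title>Example</title>",      # placeholder record content
    record_schema="urn:example:schema",       # hypothetical schema identifier
)
print(ET.tostring(request, encoding="unicode"))
```

The optional elements are emitted only when a value is supplied, which mirrors the Required column of the table.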

 

Response Structure

 

Name Type Required
srw:version xsd:string Mandatory
ucp:operationStatus xsd:string Mandatory
ucp:recordIdentifier xsd:string Optional (see note)
ucp:recordVersions sequence Optional
   ucp:recordVersion Mandatory
      ucp:versionType xsd:string Mandatory
      ucp:versionValue xsd:string Mandatory
srw:record Optional
   srw:recordPacking xsd:string Mandatory
   srw:recordSchema xsd:string Mandatory
   srw:recordData srw:stringOrXmlFragment Mandatory
   srw:extraRecordData srw:extraDataType Optional
srw:diagnostics sequence Optional
   diag:diagnostic (see table) Mandatory
      diag:uri xsd:anyURI Mandatory
      diag:details xsd:string Optional
      diag:message xsd:string Optional
srw:extraResponseData srw:extraDataType Optional
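
A client might read the response structure above along the following lines. This is a minimal sketch: the root element name updateResponse, the "http:" scheme on the two loc.gov namespace URIs, and the sample values (such as the details string rec-42) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "srw": "http://www.loc.gov/zing/srw/",              # assumed "http:" scheme
    "ucp": "info:lc/xmlns/update-v1",
    "diag": "http://www.loc.gov/zing/srw/diagnostic/",  # assumed "http:" scheme
}

# Hypothetical response used only to exercise the parser.
SAMPLE_RESPONSE = """\
<ucp:updateResponse xmlns:srw="http://www.loc.gov/zing/srw/"
                    xmlns:ucp="info:lc/xmlns/update-v1"
                    xmlns:diag="http://www.loc.gov/zing/srw/diagnostic/">
  <srw:version>1.0</srw:version>
  <ucp:operationStatus>fail</ucp:operationStatus>
  <srw:diagnostics>
    <diag:diagnostic>
      <diag:uri>info:srw/diagnostic/12/50</diag:uri>
      <diag:details>rec-42</diag:details>
      <diag:message>Record not found</diag:message>
    </diag:diagnostic>
  </srw:diagnostics>
</ucp:updateResponse>
"""


def parse_update_response(xml_text):
    """Extract the status, identifier and diagnostics from a response."""
    root = ET.fromstring(xml_text)
    diagnostics = []
    for diagnostic in root.findall("srw:diagnostics/diag:diagnostic", NS):
        diagnostics.append({
            "uri": diagnostic.findtext("diag:uri", namespaces=NS),
            "details": diagnostic.findtext("diag:details", namespaces=NS),
            "message": diagnostic.findtext("diag:message", namespaces=NS),
        })
    return {
        "version": root.findtext("srw:version", namespaces=NS),
        "status": root.findtext("ucp:operationStatus", namespaces=NS),
        # Optional element: None when the server omits it.
        "record_identifier": root.findtext("ucp:recordIdentifier", namespaces=NS),
        "diagnostics": diagnostics,
    }


result = parse_update_response(SAMPLE_RESPONSE)
print(result["status"], [d["uri"] for d in result["diagnostics"]])
```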

Elements

version

The version parameter on both request and response has the same semantics as the version parameter from SRU.

action

The action parameter determines what action the server should take with the information provided. Actions defined by the base profile and their semantics are:

Identifier Description
info:srw/action/1/create Create a new record
info:srw/action/1/replace Wholly replace an existing record with a new record
info:srw/action/1/delete Delete an existing record

Further profiles may at their discretion create new identifiers for either new actions or extensions of the above actions with additional processing requirements. (This is the reason why URIs are used to identify the action, rather than just enumerating them by integer.)

So it is important to note that the URI 'info:srw/action/1/create' (for example) refers to the "create" action defined in this document.  Another authority could create its own create action with different semantics. For example, the 'info' authority 'info:srw/action/2' could define a create action with different semantics, with (for example) identifier 'info:srw/action/2/create' and the two create actions would be distinguished because their URIs are different.
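
The point that actions are distinguished by their whole URI can be sketched as a simple lookup. The function name resolve_action and the extra_actions parameter are hypothetical; only the three base-profile URIs come from this document.

```python
# The three actions defined by the base profile in this document.
BASE_ACTIONS = {
    "info:srw/action/1/create": "create",
    "info:srw/action/1/replace": "replace",
    "info:srw/action/1/delete": "delete",
}


def resolve_action(uri, extra_actions=None):
    """Map an action URI to a handler name, comparing the full URI string.

    `extra_actions` is a hypothetical hook through which a profile could
    register further action URIs.
    """
    known = dict(BASE_ACTIONS)
    known.update(extra_actions or {})
    if uri not in known:
        raise ValueError(f"unsupported action: {uri}")
    return known[uri]


print(resolve_action("info:srw/action/1/create"))
```

A server that has not registered 'info:srw/action/2/create' would reject it here, even though its last path segment is also "create".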

operationStatus

The status of the operation is returned in this field. Defined values are:

Value Description
success The server has completed the operation successfully
fail The server could not complete the operation; additional information may be present in the diagnostics
partial Part of the operation was successful; additional information may be present in the diagnostics
delayed The server has not yet finished the operation

 

recordIdentifier

A record identifier is a unique way to distinguish a record within the current context. The string may be any means to determine identity of the record, including but not limited to identifier strings, references to a result set and a position within it or a query which will evaluate to a single record. The recommended solution is an identifier string.

Some servers may also support identifying records by sending them in the record parameter. If the recordIdentifier parameter is present, then the record parameter must not be used in this way.

The server may create a result set with the record and return it as an identifier. If so, the result set should last for a reasonable amount of time, depending on the context, to allow further references to it.

Note: The recordIdentifier parameter in the response is provided for the convenience of those using RecordUpdate with the SRU 1.1 record structure. When used with the SRU 1.2 record structure it is recommended to omit this parameter (because the SRU 1.2 record structure includes a record identifier).

recordVersions

RecordVersions is version information concerning the record. This is a means of tracking changes to a single record such that it maintains a persistent identifier throughout its existence, but the changes can still be tracked and referenced. A server may require that the most recent version of the record be supplied in a request to ensure that the operation is taking place on the most recent copy of the record.

The information is in the form of a list. Each recordVersion entry in the list is a pair consisting of a type and a value. Each type must be unique within the list. All entries must pertain to the same version. For example, if a checksum and a versionNumber are supplied, then the checksum must be that for the given versionNumber.

Identifier Description
versionNumber An incrementing number or combination of numbers
datestamp A datestamp for the time the record was last modified
checksum A checksum for the record
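
The uniqueness rule for the list can be checked mechanically, as in the sketch below. The function name and the representation of the list as (versionType, versionValue) pairs are assumptions. The rule that all entries pertain to the same version cannot be verified from the list alone, so it is not checked here.

```python
from collections import Counter


def check_record_versions(versions):
    """Return a list of problems found in a recordVersions list.

    `versions` is a list of (versionType, versionValue) pairs.
    """
    problems = []
    # Each versionType must be unique within the list.
    counts = Counter(version_type for version_type, _ in versions)
    for version_type, count in counts.items():
        if count > 1:
            problems.append(f"versionType {version_type!r} occurs {count} times")
    # versionValue is mandatory for every entry.
    for version_type, value in versions:
        if not value:
            problems.append(f"versionType {version_type!r} has an empty versionValue")
    return problems


print(check_record_versions([("versionNumber", "3"), ("checksum", "ab12")]))
```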

 

record

The record structure contains the actual record data to be used as part of the operation. It has the same structure and semantics as the SRU record structure.

Profiles and actions may require the presence of the record in either request or response for different actions.

Update also allows for a third type of recordPacking: 'url'. If the value of recordPacking in a request is 'url', then the value of recordData is a string containing a URL to the record to be operated on. The expected use for this is to allow clients to send a reference to a large record, possibly on an alternate site, and for the server to collect it at its leisure.


Actions

Create Action

 

Request

RecordIdentifier may be present, but must not be a resultSetReference. If present, it is a request that the server use the given identifier for the new record.

RecordVersions may be present, but may only be of type 'versionNumber'. If present, it is a request that the server use the given version number for the first version of the new record.

Record may be present. If present, it must contain the record to be created. If not present, it is a request for the server to create an empty placeholder record and return a reference to it for later editing.

Response

RecordIdentifier may be present. When the SRU 1.1 record structure is used, it is recommended that recordIdentifier be present, either as a value or a result set reference.  When the SRU 1.1 record structure is not used, it is recommended that recordIdentifier not be present.

RecordVersions may be present.

Record may be present, and it is recommended that it be present if the server has transformed the record in any way.

Replace Action

 

Request

RecordIdentifier must be present. It identifies the record to be replaced.

RecordVersions may be present. If present, they further identify the record to be replaced.

Record must be present.

Identifiers are crucial for record replace actions and sometimes it is necessary to disambiguate a replace request using the edit replace structure.  A metadata record describing a resource on one database may differ substantially from a record for the same resource on another database; when a record replace is sent, it is desirable to indicate unambiguously the nature of the changes being made.  This is important so that parts of the record are not inadvertently deleted because they did not appear in the replacement record.   Therefore SRU update includes an edit replace structure that can be used in extraRequestData of the request for unambiguously stating the intentions of a request. 

Element          Type        Occurrence
dataIdentifier   xsd:string  Optional (all types). Not repeatable.
oldValue         xsd:string  Mandatory where editReplaceType is R or D. Omitted for I. Not repeatable.
newValue         xsd:string  Mandatory where editReplaceType is I. Omitted for D. Optional for R. Not repeatable.
editReplaceType  xsd:string  Mandatory. Not repeatable. Values: I=Insert, D=Delete, R=Replace.

In the absence of a dataIdentifier or where the dataIdentifier is ambiguous, the default is to change or delete all occurrences of the specified data which match the specified old value.

Examples:

<editReplace>
<dataIdentifier>650 / 002 : z/001</dataIdentifier>
<oldValue>New Hebrides</oldValue>
<newValue>Vanuatu</newValue>
<editReplaceType>R</editReplaceType>
</editReplace>

In the second field 650, replace the first occurrence of subfield z to read “Vanuatu”  where it was “New Hebrides”

<editReplace>
<dataIdentifier>Holdings</dataIdentifier>
<newValue>TU</newValue>
<editReplaceType>I</editReplaceType>
</editReplace>

In the holdings section of this record, add the institution symbol “TU”

<editReplace>
<dataIdentifier>Holdings</dataIdentifier>
<oldValue>TU</oldValue>
<editReplaceType>D</editReplaceType>
</editReplace>

In the holdings section of this record, delete the institution symbol “TU”
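
The occurrence rules and the default behaviour described above can be sketched as follows. The record model (a dict mapping a data identifier to a list of values), the function names, and the treatment of a Replace without newValue as substituting an empty string are all assumptions of this sketch; the document does not define how a server stores records or interprets dataIdentifier.

```python
def validate_edit(edit):
    """Check the occurrence rules for one editReplace structure (a dict)."""
    kind = edit.get("editReplaceType")
    if kind not in ("I", "D", "R"):
        raise ValueError("editReplaceType must be I, D or R")
    if kind in ("R", "D") and "oldValue" not in edit:
        raise ValueError("oldValue is mandatory for R and D")
    if kind == "I" and "oldValue" in edit:
        raise ValueError("oldValue is omitted for I")
    if kind == "I" and "newValue" not in edit:
        raise ValueError("newValue is mandatory for I")
    if kind == "D" and "newValue" in edit:
        raise ValueError("newValue is omitted for D")
    return kind


def apply_edit(record, edit):
    """Apply one edit to `record`, a dict of data identifier -> list of values."""
    kind = validate_edit(edit)
    target = edit.get("dataIdentifier")
    if kind == "I":
        if target is None:
            raise ValueError("this sketch needs a dataIdentifier to insert")
        record.setdefault(target, []).append(edit["newValue"])
        return record
    old = edit["oldValue"]
    # Without a dataIdentifier, every occurrence of the old value is affected.
    names = [target] if target is not None else list(record)
    for name in names:
        if name not in record:
            continue
        if kind == "D":
            record[name] = [v for v in record[name] if v != old]
        else:
            # Assumption: a Replace without newValue substitutes an empty string.
            new = edit.get("newValue", "")
            record[name] = [new if v == old else v for v in record[name]]
    return record


record = {"650z": ["New Hebrides"], "Holdings": ["AB"]}
apply_edit(record, {"oldValue": "New Hebrides", "newValue": "Vanuatu",
                    "editReplaceType": "R"})
apply_edit(record, {"dataIdentifier": "Holdings", "newValue": "TU",
                    "editReplaceType": "I"})
print(record)
```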

Response

RecordIdentifier may be present. When the SRU 1.1 record structure is used, it is recommended that recordIdentifier be present; it identifies the new record. When the SRU 1.1 record structure is not used, it is recommended that recordIdentifier not be present.

RecordVersions may be present. If present, it identifies the new record.

Record may be present. It is recommended that it be present if the server has transformed the received record in any way.

Delete Action

 

Request

The Delete action should only be used when deleting an entire record.  When deleting a field, section or defined part of a record, the record replace action should be used, using as necessary the edit replace structure in extraRequestData.

It is recommended that recordIdentifier be present. If present, it identifies the record to be deleted. If it is not present, then Record must be present.

RecordVersions may be present. If present, they further identify the record to be deleted.

Record and recordIdentifier may both be present for redundancy. However, the server is not obliged to cross-check, that is, it may ignore one or the other, but if so should supply a diagnostic. See diagnostics 63 and 64.

Response

RecordIdentifier may be present. If present, it identifies the deleted record. When the SRU 1.1 record structure is not used, and if Record is present, it is recommended that recordIdentifier not be present.

RecordVersions may be present. If present, they further identify the deleted record.

Record may be present. If present, it is the record which was deleted.
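
The request-side rules for the three actions can be gathered into one check, sketched below under stated assumptions: the function name and argument shapes are hypothetical, and the rule that a create request's recordIdentifier must not be a result set reference is left out because this document does not define how such a reference is recognised.

```python
CREATE = "info:srw/action/1/create"
REPLACE = "info:srw/action/1/replace"
DELETE = "info:srw/action/1/delete"


def check_request(action, record_identifier=None, record_versions=None,
                  record=None):
    """Return a list of rule violations for a base-profile request.

    `record_versions` is a list of (versionType, versionValue) pairs.
    """
    problems = []
    if action == CREATE:
        # Only a version number may be requested for a new record.
        for version_type, _ in record_versions or []:
            if version_type != "versionNumber":
                problems.append(f"create: versionType {version_type!r} not allowed")
    elif action == REPLACE:
        if record_identifier is None:
            problems.append("replace: recordIdentifier must be present")
        if record is None:
            problems.append("replace: record must be present")
    elif action == DELETE:
        # recordIdentifier is recommended; record is required in its absence.
        if record_identifier is None and record is None:
            problems.append("delete: recordIdentifier or record must be present")
    else:
        problems.append(f"unknown action: {action}")
    return problems


print(check_request(REPLACE, record_identifier="rec-1"))
```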


Authentication

Although authentication and authorisation are an important aspect of practically every update system, the messages themselves do not carry any such information in the base profile. This is for two reasons:

1. The protocol cannot predict all of the business logic requirements that an authentication system might require
2. There are already mechanisms available for authenticating to a SOAP service, such as basic authentication in the HTTP layer

It is recommended that SRU's authenticationToken system be used if there are no other requirements. This system uses a token to identify the user, but does not specify how that token is initially obtained.

 


Record Schema

Although Update does not specify a required record schema, nor does it assume that the records are maintained in any particular schema, records must be sent somehow. The schemas which the system will accept are recorded in the service description record, which is sent in the ExplainResponse message. The server will be prepared to accept any record which will validate against the schemas listed. If there is a preferred schema, it will be noted as the default schema in the configInfo section.

If a server accepts more than one record schema, the server will either transform the record into a native schema or save it as sent. Schemas other than those listed in the service description may be stored as sent or rejected.


Concurrent Editing

The protocol does not make any attempt to dictate how the situation of multiple people editing the same record is to be handled by the server, as different usage scenarios will require different solutions.

Recommended solutions include:

  • Use the recordVersions parameter to ensure that the change is made against the most recent version of the record. The server accepts the change if it is; otherwise it rejects the change, because someone else has edited the record in the meantime.
  • Include a 'locked' status in the record's metadata and have users then set this flag while they are editing the record, and unlock it after they submit their changes.
  • Use a system which tracks changes and accept the change if it does not conflict with other changes to the record.
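
The first of these solutions could look like the sketch below on the server side. The function name and data shapes are assumptions; the diagnostic URI is built from diagnostic 55 ("Cannot process update, incorrect or invalid version") in the namespace defined in the Diagnostics section. In this sketch a version type the server does not track is also treated as a mismatch.

```python
VERSION_DIAGNOSTIC = "info:srw/diagnostic/12/55"


def check_supplied_versions(current, supplied):
    """Compare client-supplied versions with the server's current ones.

    `current` maps versionType -> versionValue for the stored record.
    `supplied` is the list of (versionType, versionValue) pairs from the
    request. Returns None when the request may proceed, otherwise a
    diagnostic dict.
    """
    for version_type, value in supplied:
        if current.get(version_type) != value:
            return {
                "uri": VERSION_DIAGNOSTIC,
                "details": f"{version_type}: expected "
                           f"{current.get(version_type)}, got {value}",
            }
    return None


stored = {"versionNumber": "4", "datestamp": "2007-06-08"}
print(check_supplied_versions(stored, [("versionNumber", "3")]))
```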

Background Processing

While it is generally expected that the UpdateResponse message will be sent after all processing has been completed, this may not be feasible for large databases. Either the change to the record or any subsequent processing may be delayed such that the response cannot say whether the operation was a success or a failure, as it hasn't yet been completed.

In this case, the server should return an operationStatus of 'delayed'. The protocol does not specify a mechanism to identify the operation in a future request to discover if it has been completed or not, but solutions might include:

  • Include a field in extraResponseData or a diagnostic which gives the operation an identifier to be used in a future request. Such a request is out of scope.
  • Simply query the database to see if the change has taken effect.
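
The second solution amounts to polling. A minimal client-side sketch follows; is_applied is a hypothetical callable supplied by the client (for example, one that queries the database and compares the result with the submitted change), and the attempt count and delay are arbitrary illustrative defaults.

```python
import time


def wait_until_applied(is_applied, attempts=5, delay_seconds=1.0,
                       sleep=time.sleep):
    """Poll after a 'delayed' response until the change is visible.

    Returns True as soon as `is_applied()` reports success, or False if
    the change is still not visible after `attempts` checks.
    """
    for attempt in range(attempts):
        if is_applied():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)   # wait before the next check
    return False


# Example with a stand-in check that succeeds on the third call.
answers = iter([False, False, True])
print(wait_until_applied(lambda: next(answers), delay_seconds=0))
```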

Diagnostics

The diagnostics below are defined for use with the namespace info:srw/diagnostic/12

The number in the first column identifies the specific diagnostic within this namespace. Thus for example diagnostic 2 below (in "Msg id" column): "Invalid component:  component rejected", is identified by the uri: info:srw/diagnostic/12/2.
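
The mapping between a message id and its URI can be sketched as a pair of helper functions. The function names are hypothetical; the base string is the namespace given above.

```python
DIAGNOSTIC_BASE = "info:srw/diagnostic/12/"


def diagnostic_uri(number):
    """Return the URI for a diagnostic number in this namespace."""
    return f"{DIAGNOSTIC_BASE}{number}"


def diagnostic_number(uri):
    """Return the diagnostic number for a URI in this namespace."""
    if not uri.startswith(DIAGNOSTIC_BASE):
        raise ValueError(f"not a Record Update diagnostic: {uri}")
    return int(uri[len(DIAGNOSTIC_BASE):])


print(diagnostic_uri(2))
```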

Msg id  Description / Action taken                                                 Other Information

Validity

1       Invalid component: record rejected                                         component id, invalid data
2       Invalid component: component rejected                                      component id, invalid data
3       Invalid component: warning only                                            component id, invalid data
4       Invalid component: default value applied                                   component id, invalid data
5       Invalid component: data corrected by server                                component id, invalid data
6       Invalid repetition of component: record rejected                           component id, invalid data
7       Invalid repetition of component: component rejected                        component id, invalid data
8       Invalid repetition of component: warning only                              component id, invalid data
9       Missing mandatory element: record rejected                                 component id, invalid data
10      Missing mandatory element: warning only                                    component id, invalid data
11      Missing mandatory element: default value applied                           component id, invalid data
12      Invalid data structure: record rejected                                    component id, invalid data
13      Invalid data structure: component rejected                                 component id, invalid data
14      Invalid data structure: warning only                                       component id, invalid data
15      Invalid data structure: default value applied                              component id, invalid data
16      Invalid data structure: data corrected by server                           component id, invalid data
17      Incorrect element length: record rejected                                  component id, element name, data
18      Incorrect element length: component rejected                               component id, element name, data
19      Incorrect element length: warning only                                     component id, element name, data
20      Incorrect element length: default value applied                            component id, element name, data
21      Incorrect element length: data corrected by server                         component id, element name, data
22      Invalid record identifier: record rejected                                 record identifier
23      Invalid record identifier: warning only                                    record identifier
24      Invalid record identifier: default applied                                 record identifier
25      Invalid record identifier: data corrected by server                        record identifier
26      Invalid component identifier: record rejected                              record identifier, component identifier
27      Invalid component identifier: component rejected                           record identifier, component identifier
28      Invalid component identifier: warning only                                 record identifier, component identifier
29      Invalid component identifier: data corrected by server                     record identifier, component identifier
30      Record schema unacceptable: record rejected                                record identifier, component identifier
31      Record schema unacceptable: component rejected                             record identifier, component identifier
32      Record schema unrecognised: warning only                                   record identifier, component identifier
33      Record schema unacceptable: record converted                               record identifier, component identifier, identifier of schema applied

Update

50      Record not found (replacement or delete)                                   record id
51      Component not found (replacement or delete)                                record id, component data
52      Record protected or locked by another user                                 record id
53      Cannot delete or replace record or component, authorization failure        record id / component data
54      Cannot delete or replace record or component without locking first         record id / component data
55      Cannot process update, incorrect or invalid version                        record id; latest version #, URL for retrieving latest version of record
56      Linked records exist, cannot delete record                                 record id
57      Record or component not found, replacement request processed as an insert  record id of new record
58      Suspect duplicate: record or component insert rejected                     record id of incoming record, record id of database record
59      Suspect duplicate: warning only                                            record id of incoming record, record id of database record
60      Incoming record matches with database record, records merged               record id of database record
61      Unspecified database error                                                 record id ++
62      Cannot process or store record, insufficient space                         record id
63      Both 'recordId' and 'record' were included on a 'delete' action. 'record' is ignored. Warning only.
64      Both 'recordId' and 'record' were included on a 'delete' action. 'recordId' is ignored. Warning only.
65      Not processed (replace or delete). Record identifier retrieved more than one record.

Examples