4A Protocol Specification

The 4A Framework currently contains a protocol for communication between client and server and a new XML annotation format. In the future, the framework will be extended with a new protocol for communication between 4A servers.

Protocol for communication between client and server version 1.1

The protocol for communication between client and server has ten parts, which are described below. Each part defines several messages. Messages can be combined into a message bundle, which is sent to the server or to the client. All messages sent in one bundle are contained in the element <messages>, which is the document element of the XML document. The XML can then be sent over any lower-level protocol.

When transported over HTTP, communication runs over two channels: AJAX and comet. After the connection is established over the first (AJAX) channel, the client automatically opens a connection over the second (comet) channel. From then on the client sends data over the AJAX channel and the server responds on both channels, using the appropriate channel for each kind of information. The comet channel is used only for sending information from server to client, and this information is sent asynchronously. Responses to information requests, error messages and other information generated in response to a client message are sent as responses over the AJAX channel. Information that is broadcast to all interested clients, e.g. new annotations, types of annotations and changes of the document, is sent over the comet channel. To keep clients simple, newly added annotations are not sent back over the AJAX (POST) channel; they are sent only over the comet channel, so the client listens for them on one channel and displays all changes in the same way.

The server receives requests from both channels at one address (in a server written in Java this would be a single servlet) and distinguishes between them by message content. If the bundle contains only the session, it is a comet request and is suspended until some data is available. If the bundle has other content, it is an AJAX request and the response is sent immediately.
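
The rule for telling the two channels apart can be sketched as follows. This is only an illustrative sketch, not part of the specification: the function name classify_bundle and the session id "abc" are assumptions, and a real server would also have to validate the session.

```python
import xml.etree.ElementTree as ET


def classify_bundle(xml_text):
    """Return "comet" if the bundle carries only a session element, else "ajax"."""
    root = ET.fromstring(xml_text)
    if root.tag != "messages":
        raise ValueError("bundle must have <messages> as its document element")
    child_tags = [child.tag for child in root]
    # A bundle holding nothing but <session> is a comet request that the server
    # suspends until data is available; anything else is answered immediately.
    if child_tags == ["session"]:
        return "comet"
    return "ajax"


# Hypothetical bundles; the session id is made up for illustration.
comet_request = '<messages><session id="abc"/></messages>'
ajax_request = '<messages><session id="abc"/><logout/></messages>'
print(classify_bundle(comet_request))  # comet
print(classify_bundle(ajax_request))   # ajax
```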

Protocol version 1.1 has some minor improvements over 1.0, e.g. multiple inheritance of types of annotations, comments on types of annotations and their attributes etc. Protocol version 1.0 should not be used, so its specification is not available.

Session management

Session management covers protocol version negotiation, user log in and user log out.

The client starts the connection by sending a <connect> message, with the highest protocol version it supports in the attribute protocolVersion:

<connect protocolVersion="1.1"/>

The server replies with an error message or with the following message:

<connected protocolVersion="1.1" sessionID=""/>

The server supplies the version of the communication protocol in use in the attribute protocolVersion. The server should communicate with the same protocol version as the client or with the lowest version that is backward compatible with the client version. If the client offers a newer version than the server supports, the server uses the newest version it supports. If the client detects that its version is not backward compatible with the server version, it must switch to the server version or to any backward compatible version. If no compatible version is available, the client must disconnect. If the protocol version offered by the client is not supported by the server, the server replies with an error message and the client can try to connect with a lower version of the protocol.

Because a newer version of the protocol can be backward compatible, client and server can implement different versions of the protocol. If the server or the client has new functionality that is unknown to the other party, the party with the older protocol version does not use it and ignores unknown elements and attributes. If a newer version of the protocol is not backward compatible, the server or client that implements it must reject a connection with that combination of protocol versions.

The end of the connection is signalled by the following message:

<disconnect/>

The server sends the session id in the attribute sessionID of the connected message. From this moment on the client sends the session id with all messages in the attribute id of the element session:

<session id=""/>

Log in is performed by the following message:

<login user="" password=""/>,

where user is the user name or email and password is the user's password. Alternatively, external authentication can be used with the following message:

<login user="" token="" system=""/>,

where user is the user name, token is the authentication token and system is the URL of the system that authenticates the user.
On a successful log in, the server replies with the list of parameters of the user's settings and with the following message:

<logged id="" name=""/>,

where id is the identifier of the user and name is the user's display name. If log in fails, the server replies with an error message. Log out is performed by the following message: <logout/>. The start of the connection with protocol version negotiation and log in can be done simultaneously. Log out and the end of the connection can be done simultaneously too.
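
Because connection start and log in can be sent together, a client might build both messages into one bundle. The following is a minimal sketch under that reading; the helper names build_connect_login and read_session_id and the credentials are hypothetical, and error handling is left out.

```python
import xml.etree.ElementTree as ET


def build_connect_login(version, user, password):
    """Build one <messages> bundle holding <connect> and <login>."""
    messages = ET.Element("messages")
    ET.SubElement(messages, "connect", {"protocolVersion": version})
    ET.SubElement(messages, "login", {"user": user, "password": password})
    return ET.tostring(messages, encoding="unicode")


def read_session_id(reply_xml):
    """Return the sessionID from a <connected> reply, or None if it is absent."""
    root = ET.fromstring(reply_xml)
    connected = root.find("connected")
    return None if connected is None else connected.get("sessionID")


# Hypothetical user name and password, for illustration only.
bundle = build_connect_login("1.1", "alice", "secret")
print(bundle)
```

The session id returned by read_session_id would then be sent in the session element of every following bundle.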

Users and user groups

To get information about user profiles, the client can send the message:

<queryPersons filter="" withGroups=""/>

To select a field in the filter, the client can use the keyword id, email or name followed by a colon. When filtering on multiple fields, the individual fields must be separated by semicolons. If the optional attribute withGroups is present with the value true, information about the groups the user is a member of is included in the user profiles. The server replies with the following:

<persons>
  <person id="" login="" name="" email="" photoURI=""/>
</persons>

The attribute name contains the full name of the user. The attribute photoURI may contain the URI of a photo of the user, which can be displayed alongside the user. If information about user groups was requested, each person tag contains a userGroups tag (see below). The person tag can also contain further information about the user's profile in additional tags and text content.
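
The filter syntax of queryPersons (keyword, colon, value, with fields separated by semicolons) can be sketched as below. The helper names are hypothetical, and the sketch assumes that values contain no colons or semicolons, since the protocol text does not say how such characters would be escaped.

```python
import xml.etree.ElementTree as ET

# Keywords the filter accepts according to the text above.
FILTER_FIELDS = ("id", "email", "name")


def build_person_filter(**fields):
    """Join keyword:value pairs with semicolons, e.g. "name:John;email:j@x"."""
    parts = []
    for key, value in fields.items():
        if key not in FILTER_FIELDS:
            raise ValueError(f"unsupported filter field: {key}")
        parts.append(f"{key}:{value}")
    return ";".join(parts)


def build_query_persons(with_groups=False, **fields):
    """Build a <queryPersons> message; withGroups is added only when requested."""
    query = ET.Element("queryPersons", {"filter": build_person_filter(**fields)})
    if with_groups:
        query.set("withGroups", "true")
    return ET.tostring(query, encoding="unicode")


print(build_person_filter(name="John Smith", email="john@example.com"))
# name:John Smith;email:john@example.com
```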

To get information about user groups, the client can send the following message:

<queryUserGroups filter="" withPersons=""/>

The client can use the URI of a group or its name in the filter. If the optional attribute withPersons is present with the value true, information about the users in the group is included in the user groups. The asterisk wildcard (any number of any characters) can be used in the group name. The server replies with the following:

<userGroups>
  <group uri="" name=""/>
</userGroups>

If information about group members was requested, a persons tag (see above) is included in each group tag.

To add the user to a user group, the client can send the following message:

<join group=""/>

where the attribute group contains the URI of the user group. To leave a user group, the client can send the following message:

<leave group=""/>

Subscriptions to annotation taking

The client can receive only annotations of selected types from selected sources. A source can be a user or a URI, which can identify a user group, a 4A server or another generic source.

The client subscribes to annotations by sending the following message:

<subscribe>
  <source type="" user=""/>
  <source type="" uri=""/>
  <source type=""/>
  <source user=""/>
  <source uri=""/>
</subscribe>

There can be any nonzero number of source elements, and they can have the combinations of attributes listed above. The attribute user identifies a user; the attribute uri identifies a user group or any generic source of annotations. The attribute type identifies a type of annotation and all its subtypes. The asterisk wildcard (any number of any characters) can be used in type.

If type is not present, all types of annotations from the given source (with respect to the user groups the user is a member of) are received. If the source is not present, all annotations of the given type are received.
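
A subscribe message following these rules could be assembled as in the sketch below. The function name build_subscribe, the example URIs and the strict rejection of other attribute combinations (such as user together with uri) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Attribute combinations that appear in the <subscribe> syntax above.
ALLOWED_COMBINATIONS = {
    frozenset({"type", "user"}),
    frozenset({"type", "uri"}),
    frozenset({"type"}),
    frozenset({"user"}),
    frozenset({"uri"}),
}


def build_subscribe(sources):
    """Build a <subscribe> message from dicts with the keys type, user, uri."""
    sources = list(sources)
    if not sources:
        raise ValueError("at least one <source> element is required")
    subscribe = ET.Element("subscribe")
    for source in sources:
        if frozenset(source) not in ALLOWED_COMBINATIONS:
            raise ValueError(f"unsupported attribute combination: {sorted(source)}")
        ET.SubElement(subscribe, "source", dict(source))
    return ET.tostring(subscribe, encoding="unicode")


# Hypothetical type filter and URIs.
message = build_subscribe([
    {"type": "Animal*", "user": "http://example.com/users/1"},
    {"uri": "http://example.com/groups/biology"},
])
print(message)
```

The same builder would work for unsubscribe with a different element name, because both messages share the same structure.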

To unsubscribe from a given source or from types of annotations, the unsubscribe message can be sent:

<unsubscribe>
  <source type="" user=""/>
  <source type="" uri=""/>
  <source type=""/>
  <source user=""/>
  <source uri=""/>
</unsubscribe>

The same rules apply to unsubscribe as to subscribe (any number of source elements etc.). The client is automatically subscribed to all of the user's own annotations and to all annotations from all of the user's groups. The client can, of course, unsubscribe from these default sources.

Synchronization of document

Synchronization of the document is the process in which the server obtains a copy of the current version of the annotated document. If the server receives the document for the first time, it stores it in the database and returns the URI of the stored version to the client. The client uses the URI from the server in all created annotations. If the server already has a copy of the document, it compares both versions and, if they have the same content, sends the URI of the stored version to the client. If the contents are almost the same, the server analyses the differences and checks whether existing annotations would be affected by updating the stored version. If no annotations would be affected, the document is updated transparently (for the client this is the same as matching contents). If annotations would be affected by the update, the server sends the client a warning that some annotations were updated and finishes the synchronization. If many annotations would be fundamentally changed or superseded, a synchronization error occurs. The user must then decide whether to finish the synchronization and supersede some or all annotations (move them to the level of the whole document), or not to finish it and instead find and annotate the stored (probably older) version of the document or another document.

If several users are working with a document, those users may have a newer version of the document that is not stored on the system in which the annotation client is displayed (a newly connecting user sees the version that the already connected users were using when they started their work). In this case a synchronization error occurs too. For this reason the synchronization error contains the content of the document stored on the annotation server, so the user can see both their own version and the server's version in the synchronization dialog and can select one of them or create a new one. After synchronization, the versions of all connected users are updated.

Since the client can also be a simple text editor that does not work with structured text, the server should support linearization of the document too. In this case the client linearizes the document to plain text and sends it to the server in linearized form. If the server has a structured form of the document, it linearizes the stored version and compares it with the version from the client. If both versions are the same, the synchronization finishes and the server adjusts all newly sent annotations to the structured version of the document. If the linearized forms differ, a synchronization error occurs.

Syntax:

<synchronize resource="http://example.com/documents/doc1.txt" linearize="false" overwrite="false">
  <content>
    <![CDATA[
      ...
    ]]>
  </content>
</synchronize>

The client sends the source URI (the address from which the document originates) and a copy of the content of the annotated document. The attribute resource contains the URI; the element content contains the content of the document (probably in a CDATA section).

The optional attribute linearize denotes whether the client works with the linearized form of the document. The default value is false.

The optional attribute overwrite makes it possible to force synchronization when the content of the stored version of the document (the copy on the server) differs from the current state of the document. In this case the server overwrites the document and invalidates all affected annotations (deletes them, or moves them to the level of the whole document and adds information about the change to their text content). The client should not use this attribute in the first synchronization attempt, and its use must be confirmed by the user. If the contents are the same, the server ignores this attribute. The default value is false.

On successful synchronization, the server replies with the following message:

<synchronized resource=""/>

The attribute resource contains the URI of the stored copy of the annotated document on the server. The client must use this URI in annotations.

If, during work, the content of an annotated fragment does not match the content that the server finds at the same position, synchronization is lost. In this case the server sends an error message and the message <resynchronize/> (an element without attributes or content), which requests resynchronization.

The client resynchronizes by sending the following message:

<resynchronize>
  <content>
    <![CDATA[
      ...
    ]]>
  </content>
</resynchronize>

All annotations must be reloaded after resynchronization, so the server sends all annotations in reply to the resynchronize message.

If the client modifies the document, it must send each change to the server:

<textModification path="" offset="" length="">
  <![CDATA[
    New content of the selected fragment ...
  ]]>
</textModification>

The attribute path contains the XPath of the document DOM node in which the change was made. The attributes offset and length contain the offset and length of the changed fragment. If offset and length are empty, the whole DOM node is selected. The new content of the fragment is linearized (with tags) in the body of the element <textModification/>. If a new fragment was inserted, the length of the old fragment is zero. If a fragment was deleted, the element with the new content is empty. When working with the linearized form of the document, the XPath is empty.

If a DOM node other than a text node is selected (an element node), the modification is applied to the node value, so for "<p>content</p>" the offset of "content" is 3. If an attribute node is selected, the offset of the value is given by the attribute name and '="', so for "<a href="http://">" the offset of "http://" is 6.
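
The offset and length arithmetic can be illustrated with a small sketch that applies a modification to the linearized value of a node. The function name is hypothetical, and treating the attribute node as the string href="http://" is an assumption derived from the offset rule above, not something the protocol states explicitly.

```python
def apply_text_modification(node_text, offset, length, new_content):
    """Replace node_text[offset:offset + length] with new_content."""
    if offset < 0 or length < 0 or offset + length > len(node_text):
        raise ValueError("fragment lies outside the node")
    return node_text[:offset] + new_content + node_text[offset + length:]


# Element node: "content" starts at offset 3 in "<p>content</p>".
print(apply_text_modification("<p>content</p>", 3, 7, "text"))      # <p>text</p>
# Insertion: the old fragment has length zero.
print(apply_text_modification("<p>content</p>", 3, 0, "new "))      # <p>new content</p>
# Deletion: the new content is empty.
print(apply_text_modification("<p>content</p>", 3, 7, ""))          # <p></p>
# Attribute node: the value starts at offset 6, after 'href="'.
print(apply_text_modification('href="http://"', 6, 7, "https://"))  # href="https://"
```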

The server immediately distributes the textModification message to all other clients working with the same document.

Transmission of types of annotations

Types of annotations are transmitted in both directions. When a type is added, changed or deleted, the client sends the change to the server immediately and the server distributes it to all other interested clients. The client need not keep the whole tree of types of annotations in memory. If the client has only part of the tree, the server can either send it all changes or keep track of which part of the tree the client holds and send only the changes in that part. The server must always keep the whole tree of types of annotations of the given user group (all changes are sent to it).

The client can request part of the tree of types of annotations with the following message:

<queryTypes filter=""/>

The attribute filter makes it possible to obtain only a selected subtree of the tree of types of annotations. It contains the linearized name of a type of annotation (the path in the tree, with nodes separated by "->") or the URI of a type of annotation, optionally with the asterisk wildcard (any number of any characters). The filter can thus select a subtree or a set of types whose linearized name or URI contains the given text.
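
One possible way to evaluate such a filter is to translate the asterisk into a regular expression. This sketch assumes that a filter must match the whole linearized name or URI and that the asterisk is the only special character; the protocol text does not pin down either point, and the function names are hypothetical.

```python
import re


def filter_to_regex(filter_text):
    """Turn a filter with asterisk wildcards into a compiled regular expression."""
    # Escape everything literally, then let each asterisk match any characters.
    parts = (re.escape(part) for part in filter_text.split("*"))
    return re.compile(".*".join(parts), re.DOTALL)


def filter_matches(filter_text, value):
    """Return True if the whole value matches the filter."""
    return filter_to_regex(filter_text).fullmatch(value) is not None


name = "Animal->People->Employee"
print(filter_matches("Animal->*", name))   # True: the subtree under Animal
print(filter_matches("*People*", name))    # True: name contains "People"
print(filter_matches("Animal", name))      # False: no wildcard, no full match
```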

If the client requests the tree of types of annotations, the server replies with a message that adds types of annotations (see below). If no types are selected by the filter, the server sends an empty list of types to add.

Types of annotations are transmitted by the following message:

<types>
  <add>
    <type name="" ancestor="" uri="" group="" restrictedAttributes="">
      <attribute name="" type="" required=""/>
      <attribute name="" type="" required="">
        <comment></comment>
      </attribute>
      <ancestor uri=""/>
      <comment></comment>
    </type>
  </add>
  <change/>
  <remove/>
</types>

The element types contains:

element add, if a type was added,
element change, if a type was changed,
element remove, if a type was removed (deleted).

Each of the three elements above can contain any number of type elements. The attributes of this element are name (name of the type), ancestor (URI of the primary ancestor type), uri (URI of the type) and group (URI of the user group to which the type belongs). If the URI of the primary ancestor is empty, it is a basic type (a child of the invisible root of the tree). The URI identifies the type unambiguously. If group is not included, it is determined by the group of the primary ancestor type or by the default group of the user who added the type.

Each type can have default attributes, which are contained in attribute elements. Each attribute has a name (name) and a type (type). For an attribute of a simple data type, type holds the name of that type. For an attribute whose type is a type of annotation (a nested annotation or a link to an annotation), type holds the URI of the desired type of annotation (information about nesting or linking is not needed here, because both variants have the same meaning here). If the attribute is required (a value must be supplied), the element attribute has the attribute required with the value true.

The simple data types of attributes are the following:

String - any string
URI - URI
DateTime - date and time according to RFC 3339 (iso-date-time with datespec-full)
Date - date according to RFC 3339 (datespec-full)
Time - time according to RFC 3339 (time)
Integer - integer number
Decimal - decimal number
Boolean - boolean (true or false)
GeoPoint - geographic point according to Basic Geo (WGS84 lat/long) Vocabulary (basic)
AnyAnnotation - any nested annotation or annotation link (it cannot be the type of an attribute of a concrete annotation; a concrete type must be selected when the annotation is created)
Person - user of 4A annotation server identified by URI or e-mail. This type is deprecated and may be removed in future versions.
Duration - duration according to RFC 3339 (duration)
Binary - binary data (intended for saving any file with the annotation, e.g. in OpenDocument format); data is encoded in base64 (file size can be limited in some server implementations, e.g. to 2 MB).
Text - long text data
Image - URI of an image (similar to URI but displayed differently: a preview with a link to the full-size image)
Entity - entity from the controlled vocabulary

The URI of a type contains the path to the root of the tree of types of annotations. Each segment of this path is taken from the names of the primary ancestors, and the linearized name is composed of the names of the primary ancestors too. A type of annotation can, however, have more than one ancestor. All ancestors are given in a list of ancestors, specified by ancestor elements in the body of the element type. All ancestors can be used by the user to find the type of annotation (the type is displayed under all of its ancestors in the tree, can be found in autocomplete through any ancestor etc.), but only the primary ancestor is used in the URI, and the server has no way to find out how the user found the type. So only the URI of the type is stored in the annotation, and only the linearized name built from primary ancestor names can be displayed with the annotation (the other ancestors can be displayed along with the primary one, but this is not mandatory). The attribute uri of the element ancestor contains the URI of the ancestor type of annotation.
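
Reading the primary ancestor and the full list of ancestors out of a types message might look like the following sketch. The type, its URIs and its attribute are invented for illustration, and listing the primary ancestor again among the ancestor elements is an assumption about how the list is filled.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; names and URIs are not taken from the specification.
TYPES_MESSAGE = """
<types>
  <add>
    <type name="Employee"
          ancestor="http://example.com/types/People"
          uri="http://example.com/types/People/Employee">
      <attribute name="employer" type="String" required="true"/>
      <ancestor uri="http://example.com/types/People"/>
      <ancestor uri="http://example.com/types/Job"/>
      <comment><![CDATA[A person employed by an organisation.]]></comment>
    </type>
  </add>
</types>
"""


def parse_added_types(xml_text):
    """Return one dict per <type> element found inside <add>."""
    root = ET.fromstring(xml_text)
    types = []
    for type_el in root.findall("./add/type"):
        comment = type_el.findtext("comment")  # direct child of <type> only
        types.append({
            "name": type_el.get("name"),
            "uri": type_el.get("uri"),
            # An empty or missing ancestor attribute marks a basic type.
            "primary_ancestor": type_el.get("ancestor") or None,
            "ancestors": [a.get("uri") for a in type_el.findall("ancestor")],
            "attributes": [
                (a.get("name"), a.get("type"), a.get("required") == "true")
                for a in type_el.findall("attribute")
            ],
            "comment": comment.strip() if comment else None,
        })
    return types


for parsed_type in parse_added_types(TYPES_MESSAGE):
    print(parsed_type["name"], parsed_type["ancestors"])
```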

A type of annotation can have a comment that clarifies its meaning, and so can an attribute of a type of annotation. In both cases the comment is contained in the body of a comment element inside the element to which it applies. At most one comment element can appear in the same type or attribute element. A CDATA section can be used to wrap the content of the comment (this is recommended).

A type of annotation can have restricted attributes (the optional attribute restrictedAttributes with the value true). In this case no attributes can be added and the types of attributes cannot be changed. If the type of an attribute is a type of annotation, however, either a nested annotation or a link to an annotation can still be used there.

The name of a type cannot be changed other than by removing the old type and creating a new one.

When a user adds a new attribute to a type of annotation, the user must fill in the attribute name and select the type of the attribute value. If types of annotations are imported from an ontology, there will be attributes that have a name and a value type but for which it is not clear to which type of annotation they should be assigned. These are the so-called "attributes from ontology", which offer an alternative way to supply the name and value type of an attribute. The user can then choose between filling in the name and value type and selecting from the list of attributes imported from the ontology.

The client can request attributes from ontology with the following message:

<queryAttrFromOnto group=""/>

where the attribute group contains the URI of the user group to which the requested attributes belong, or an asterisk to get attributes from ontology from all user groups.

The server responds with the following message:

<attrsFromOntology>
  <attribute name="" type="" group=""/>
  <attribute name="" type="" group="">
    <comment></comment>
  </attribute>
</attrsFromOntology>

The element attrsFromOntology can contain any number of attribute elements with individual attributes. Each attribute has a name (name), a type (type) and the URI of the user group to which it belongs (group). For an attribute of a simple data type, type holds the name of that type. For an attribute whose type is a type of annotation (a nested annotation or a link to an annotation), type holds the URI of the desired type of annotation (information about nesting or linking is not needed here, because both variants have the same meaning here).

Transmission of annotations

Annotations are transmitted similarly to types of annotations:

<annotations>
  <add>
    <annotation/>
  </add>
  <change/>
  <remove/>
</annotations>

Each addition, change or removal of an annotation is sent to the server immediately. The element annotations contains the elements add (added annotations), change (changed annotations) and remove (removed annotations) according to the action performed. Each of these elements can contain any number of annotation elements.

The server immediately distributes each change to the other interested clients (those subscribed to the annotations to which this annotation belongs). If an annotation was added, it is also sent back to the originating client (over the comet channel), so that the client obtains the id assigned to the newly added annotation.

All annotations assigned to the document (from the sources and of the types defined by the client's subscriptions) are sent to the client as added annotations automatically after synchronization or resynchronization of the document.

If the client needs to reload some annotation (e.g. after an unsuccessful attempt to edit it) or all annotations, it can send one of the following messages to the server:

<reload uri="http://example.com/annotations/123456"/>
<reload all="true"/>

The attribute uri in the first variant of the message contains the URI of the requested annotation. The attribute all in the second variant states that all annotations should be sent again.

Suggesting of annotations

The server can suggest automatically generated annotations of the document or a part of it to the client. The client requests suggestions with the following message:

<suggestAnnotations path="" offset="" length="" type=""/>

The attributes path, offset and length contain the path (XPath), offset and length of the fragment of the document for which annotations should be suggested. If only the XPath is given, annotations are suggested for the whole DOM node. If none of these attributes is given, annotations are suggested for the whole document. The optional attribute type can contain the requested type of suggested annotations. All subtypes of the requested type are suggested too.

One path, offset and length can point to one DOM node or to part of its content. If the client needs suggestions for multiple nodes (e.g. three paragraphs), it must send multiple suggestAnnotations elements. If more than one element is sent, the server may take the type of annotation from any of them, so all elements should have the same value of the attribute type. If the client wants to change the list of DOM nodes for which annotations should be suggested, it must send a whole new list of suggestAnnotations elements. When a new element or elements are sent, the current list of fragments on the server is replaced by the new one. So if more than one suggestAnnotations element is sent, they form a list of fragments, not several requests for several types of annotations. This may change in future versions of the protocol, but it would increase the complexity of the server.

The server replies with annotation suggestions:

<suggestions>
  <annotation tmpId="" confidence=""/>
</suggestions>

The element suggestions can contain any number of annotation elements. The attribute confidence contains the estimated confidence of the annotation in percent. The client can use this value to confirm or refuse annotations automatically. Annotations in suggestions do not have a persistent identifier, but they have a temporary identifier (the attribute tmpId of the element annotation). The temporary identifier can be used to give the server feedback about how suggested annotations were handled. It can also be used to send information about removing a suggestion from the offer when the suggestion is updated (a simpler solution, because the client needs no update functionality for suggestions).
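
Automatic confirmation or refusal based on confidence could be sketched as follows. The thresholds of 90 and 20 percent, the function name and the assumption that confidence is a plain number without a percent sign are all illustrative choices, not values given by the protocol.

```python
import xml.etree.ElementTree as ET


def triage_suggestions(xml_text, accept_at=90.0, refuse_below=20.0):
    """Map each tmpId to "confirm", "refuse" or "ask" based on confidence."""
    root = ET.fromstring(xml_text)
    decisions = {}
    for annotation in root.findall("annotation"):
        confidence = float(annotation.get("confidence", "0"))
        if confidence >= accept_at:
            decision = "confirm"  # candidate for automatic confirmation
        elif confidence < refuse_below:
            decision = "refuse"   # candidate for automatic refusal
        else:
            decision = "ask"      # leave the choice to the user
        decisions[annotation.get("tmpId")] = decision
    return decisions


# Hypothetical reply with made-up identifiers and confidence values.
reply = """
<suggestions>
  <annotation tmpId="1" confidence="97"/>
  <annotation tmpId="2" confidence="55"/>
  <annotation tmpId="3" confidence="5"/>
</suggestions>
"""
print(triage_suggestions(reply))
# {'1': 'confirm', '2': 'ask', '3': 'refuse'}
```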

The tmpId can also be used to create links between suggestions. In this case an attribute of a suggested annotation can look like this: <a:attribute name="reason" type="annotationLink" tmpId="1234567"/>. All linked suggestions must be confirmed together with the annotation in which they are used: if the user confirms an annotation, all annotations linked from its attributes must be confirmed too, and so on. A suggestion can be linked from an existing annotation that is being edited, but when the changes are saved, the linked suggestion must be confirmed with it.

If the client wants to confirm a suggested annotation (after an action of the user or after an automatic decision based on the user's settings), it sends it to the server like any other added annotation (in the annotations message), but the element annotation contains the attributes confirmed and tmpId. The attribute tmpId contains the temporary identifier of the suggested annotation and the attribute confirmed contains the method of confirmation. The methods of confirmation are:

manually - annotation confirmed by user,
manuallyEdited - the user edited the suggested annotation and then saved it,
automatically - automatically confirmed suggestion (according to configuration of the client),
automaticallyEdited - automatically confirmed suggestion which was automatically edited (adapted).

When the document content changes, the list of suggested annotations may need to be updated. In this case the server sends the update immediately. To simplify client implementations, there is no update message: if a suggestion changes, it is removed and a new suggestion is sent. For this reason the element suggestions can contain any number of delete elements:

<suggestions>
  <annotation tmpId="" confidence=""/>
  <delete tmpId=""/>
</suggestions>

If the client has no annotation with the given temporary identifier, the delete element is ignored. If the user (or the client, according to its settings) refuses one of the suggestions, the client sends the following message to the server:

<refusedSuggestions>
  <suggestion tmpId="" method=""/>
</refusedSuggestions>

The attribute method contains the method of refusal. It can be one of the following:

manually - refused by the user,
automatically - automatically refused according to settings of the client.

If the client does not want to receive further suggestions of annotations, it can request suggestions for a fragment of zero length. This fragment should be at the start of the document content.

Controlled vocabulary support

The controlled vocabulary is a feature that allows the annotation client to search the entities of a dictionary. The server processes the queries and sends the answers back to the client as a message.

The annotation client queries dictionary entities with a queryEntities message. This message has two parameters, one required (filter) and one optional (type). The parameter type indicates the group in which the server searches for entities (e.g. Artwork, Location or Artist). If type is not filled in, the annotation server searches through all types. The annotation client restricts the results with the filter attribute. Examples of requests:

<queryEntities type="Artwork" filter="Sunflowers"/>

<queryEntities type="" filter="Sunflowers"/>

The server responds to this query asynchronously with <entity> elements (this means that the server cannot guarantee that responses arrive in the order in which the requests were sent). The response contains all entities that the server found. Each found entity is returned in this form:

<entity name="" uri="" type="" visualRepresentation="">
     <![CDATA[ ... ]]>
</entity>

where the parameter name contains the name of the entity, uri contains the address of the entity, type indicates the group of entities (equal to the group in the query, if it was filled in) and visualRepresentation contains the URI of an image that represents the entity. The entity tag contains the description of the entity in a CDATA block. The name and uri parameters are required; the others are optional. Example results for the query above:

<entities>
    <entity name="Sunflowers from Arles" uri="http://www.artnet.com/artwork/426018191/3952/sergei-chepik-sunflowers-from-arles.html" type="Artwork" visualRepresentation="http://images.artnet.com/artwork_images/220/547806.jpg">
     <![CDATA[Sunflowers from Arles by Sergei Chepik is available at
     Catto Gallery.]]>
    </entity>
    <entity name="Sunflowers in a vase with doll, book and box beside them" uri="http://artsalesindex.artinfo.com/asi/lots/1515253" type="Artwork"/>
</entities>

If the criteria in the query did not match any entity, the server sends an empty <entities> message:

<entities>
</entities>

After the user selects an entity, the annotation client adds it as an entity attribute of the annotation. The parameters of the attribute have the same meaning as the parameters of the query results. The entity attribute looks like this:

<attribute type="entity" name="">
    <entity name="" uri="" type="" visualRepresentation="">
         <![CDATA[ ... ]]>
    </entity>
</attribute>

Transmission of settings

Settings are a list of items that have a name and a string value. They are divided into settings of the client and settings of the server. Settings of the client have the prefix "Client" (e.g. "ClientAnnotationTypeColor:Animal->People->Employee" with the value "green"). When settings are displayed to the user, some (known) settings are processed and displayed in a form, and the others are displayed in a table of other settings in which the user can change them.

The concrete items of settings depend on the implementation of the server and the client. To avoid conflicts when one user uses multiple clients, it is recommended to give items names prefixed with the type and name of the client (e.g. "ClientFFExtAnnotExt" would be the prefix for the Annotation Extension for Mozilla Firefox), or to use names whose semantics are entirely clear (e.g. "ClientDefaultAnnotationType" for the default type of annotation).

Because only a few settings parameters are expected and they are transmitted infrequently, the whole list of settings is sent in all cases. If a parameter is not contained in the list, it is removed where possible (some parameters can be required by the server). If a parameter is required and is not contained, it is set to its default value.

Settings are transferred in the following message:

<settings>
  <param name="" value=""/>
</settings>

There can be any number of param elements, each containing an individual settings parameter. The attribute name contains the name of the parameter and the attribute value contains its string value.
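
Since the whole list of settings is always sent, a client can treat the message as a plain mapping from names to string values. The sketch below shows one way to serialize and parse it; the helper names are hypothetical, while the two parameter names are the examples mentioned in this section.

```python
import xml.etree.ElementTree as ET


def build_settings(params):
    """Serialize a dict of settings into a <settings> message."""
    settings = ET.Element("settings")
    for name, value in params.items():
        ET.SubElement(settings, "param", {"name": name, "value": str(value)})
    return ET.tostring(settings, encoding="unicode")


def parse_settings(xml_text):
    """Parse a <settings> message back into a dict of string values."""
    root = ET.fromstring(xml_text)
    return {param.get("name"): param.get("value") for param in root.findall("param")}


client_settings = {
    "ClientAnnotationTypeColor:Animal->People->Employee": "green",
    "ClientDefaultAnnotationType": "Animal->People",
}
message = build_settings(client_settings)
print(parse_settings(message) == client_settings)  # True
```

Dropping a key from the dict before sending corresponds to removing that parameter on the server, subject to the rule about required parameters described above.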

Errors and warnings

Error messages inform the client about errors. An error message contains the number of the error (attribute number) and its text (in the element message). The element error can contain additional information about the specific error. For an error of access privileges to an annotation, it contains information about the resources for which the user does not have permission. For an unsuccessful operation on an existing annotation (editing or removal), it contains information about the annotation concerned (which annotation must be reloaded). For a problem with attributes, information about the specific attributes must be included. Other error messages are handled similarly.

The text content of error messages is localized according to the settings parameter "ServerLanguage", which takes values from ISO 639-2 (the codes for bibliographic purposes, designated as "B"). If the parameter is not set, the default language, English, is used.

Syntax:

<error number="1">
  <message>
    <![CDATA[
      Bad login or password.
    ]]>
  </message>
</error>

<error number="2">
  <accessDenied user=""/>
  <accessDenied uri=""/>
  <accessDenied type=""/>
  <accessDenied type="" user=""/>
  <accessDenied type="" uri=""/>
  <message>
    <![CDATA[
      Permission to selected annotation denied.
    ]]>
  </message>
</error>
<error number="3">
  <message>
    <![CDATA[
      Read only access - annotation not saved.
    ]]>
  </message>
</error>
<error number="4">
  <reload uri="http://example.com/annotations/123456"/>
  <message>
    <![CDATA[
      Editation not permitted.
    ]]>
  </message>
</error>
<error number="5">
  <reload uri="http://example.com/annotations/123456"/>
  <message>
    <![CDATA[
      Deleting not permitted.
    ]]>
  </message>
</error>
<error number="6">
  <reload uri="http://example.com/annotations/123456"/>
  <message>
    <![CDATA[
      Missing mandatory attributes.
    ]]>
  </message>
  <attribute name="" type=""/>
  <attribute name="" type=""/>
</error>
<error number="7">
  <reload uri="http://example.com/annotations/123456"/>
  <message>
    <![CDATA[
      Bad attribute value.
    ]]>
  </message>
  <attribute name="" type=""/>
</error>
<error number="62">
  <message>
    <![CDATA[
      Another client is holding a different version of this document.
    ]]>
  </message>
  <content>
    <![CDATA[<html><head></head><body><p>John Millington Synge travelled to the Aran Islands.</p></body></html>]]>
  </content>
</error>
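A client might read an error element along these lines. This is a sketch under assumptions: the function name and the returned dictionary layout are illustrative, and only the reload and attribute children shown in the examples above are extracted.

```python
import xml.etree.ElementTree as ET


def parse_error(xml_text):
    """Extract the fields of one <error> element into a plain dict."""
    error = ET.fromstring(xml_text)
    return {
        "number": int(error.get("number")),
        "message": error.findtext("message", default="").strip(),
        # Annotations the client should reload after a failed operation.
        "reload": [r.get("uri") for r in error.findall("reload")],
        # Attributes the error refers to, as (name, type) pairs.
        "attributes": [
            (a.get("name"), a.get("type")) for a in error.findall("attribute")
        ],
    }


sample = """<error number="4">
  <reload uri="http://example.com/annotations/123456"/>
  <message><![CDATA[ Editation not permitted. ]]></message>
</error>"""
info = parse_error(sample)
print(info["number"], info["message"], info["reload"])
```

Since the message text is localized, a client would normally branch on the number and use the text only for display.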

The error numbers are as follows:

0. Unsupported protocol version.
1. Bad login or password.
2. Permission to selected annotation denied.
3. Read only access - annotation not saved.
4. Editation not permitted.
5. Deleting not permitted.
6. Missing mandatory attributes.
7. Bad attribute value.
8. Bad selection of fragment - suggestions is not possible.
9. Synchronization failed - there is registered different content with this URI.
10. Synchronization is not possible.
11. Synchronization error.
12. Editation of this annotation type is not permitted.
13. Deleting of this annotation type is not permitted.
14. Addition of annotation types is not permitted.
15. Type of attribute not exists.
16. Added annotation type is malformed.
17. Added annotation type attributes are malformed - some attributes omitted.
18. Edited annotation type is malformed - changes not saved.
19. Type of annotation does not exists.
20. Modification of annotation type name, ancestor or group is not possible.
21. Error in settings - not saved.
22. Synchronization message without resource address.
23. Synchronization message without document content.
24. Bad source in annotation.
25. Bad annotated fragment. Fragment omitted.
26. Bad attribute in annotation. Attribute omitted.
27. Bad value of extended type of attribute (unknown person in attribute etc.).
28. Bad confirmation method or temporal annotation id.
29. Edited annotation not found. Changes not saved.
30. Removed annotation not found. Annotation not deleted.
31. Session expired or bad session number.
32. Bad request. Possible client error or incompatible protocol.
33. Error in server module.
34. Requested annotation not found.
35. Bad XPath expression in annotated fragment. Fragment omitted.
36. Error in annotated document.
37. Bad fragment offset or length in annotated fragment. Fragment omitted.
38. Errors in edited annotation. Changes not saved.
39. Text modification is not applicable.
40. Unknown annotation type - suggestions is not possible.
41. Bad date format in annotation. Date have been set to actual date.
42. Bad date format in attribute. Attribute omitted.
43. Document not synchronized. Manipulation with annotations is not possible.
44. Bad information about annotation author.
45. Unknown user group.
46. Unknown user group in annotation type. Group will be set another way.
47. Deletion of used types is not allowed. You must delete annotations first.
48. Server internal error caused that data has not been saved.
49. Duplicit annotation type URI.
50. Description of text modification is bad.
51. Deletion of used types is not allowed. You must delete subtypes first.
52. Ambiguous fragment (found in more than one place in document).
53. Bad URI of annotated document.
54. Bad description of annotation.
55. Added annotation type ancestors are malformed - some ancestors omitted.
56. Added annotation type URI is malformed or lack of informations provided to it's creation.
57. You can't join to group with administrators. Only administrator can do that.
58. You can't leave group with administrators, because You are the last administrator.
59. Some annotations should be updated, but this changes can't be saved.
60. Unable to connect to the NLP server.
61. Server internal error caused that some part of data has not been saved.
62. Another client is holding a different version of this document.
Unknown error.

If an annotation is superseded (removed or moved to the level of the whole document) without an error occurring, there should be a way to inform the user. For this reason, the server sends warning messages. The client should display the warning to the user (near the affected annotation, in a side bar, or in a dialog window).

A warning message has the following syntax:

<warning number="1" annotation="http://example.com/annotations/123456">
  <![CDATA[
    Annotation superseded.
  ]]>
</warning>

Each warning has a number (attribute number) and text content to display to the user. If the warning is bound to a specific annotation, the element warning has the attribute annotation with the URI of that annotation.

The warning numbers are as follows:

1. Annotation superseded.
2. Annotation orphaned.
3. Annotation automatically updated.
4. Annotation partially orphaned.
5. You are not logged in. You can only log in or disconnect (other messages will be discarded).
6. Annotated fragments were updated.
Other warning (server internal error).
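Reading a warning on the client side can be sketched as follows. The function name and the returned tuple are illustrative assumptions; the annotation URI is None when the warning is not bound to a specific annotation.

```python
import xml.etree.ElementTree as ET


def parse_warning(xml_text):
    """Return (number, annotation URI or None, text) of a <warning>."""
    warning = ET.fromstring(xml_text)
    return (
        int(warning.get("number")),
        warning.get("annotation"),
        (warning.text or "").strip(),
    )


sample = """<warning number="1" annotation="http://example.com/annotations/123456">
  <![CDATA[
    Annotation superseded.
  ]]>
</warning>"""
number, annotation_uri, text = parse_warning(sample)
print(number, annotation_uri, text)
```

A client could use the annotation URI to decide whether to show the text next to the affected annotation or in a general place such as a side bar.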

Confirmation without data supplied

When a message requesting the execution of some operation is sent, it is necessary to reply to it (to confirm delivery of the message and successful execution). Because confirmation is implicit, the reply can contain only information that results from the executed operation or other useful data (if no errors are sent, everything succeeded). If there is no useful data to reply with (e.g. after an annotation was successfully removed), something must still be sent to confirm successful execution of the requested operation. In this case the server sends the following message:

<ok/>

The server can also use this message to prevent a connection timeout on the comet channel.
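The rule for replies can be sketched like this, assuming the server has already collected the elements produced by an operation into a list. The function name is illustrative; the enclosing messages element is the bundle element described earlier in this specification.

```python
import xml.etree.ElementTree as ET


def build_reply(result_elements):
    """Wrap the results of an operation in a <messages> bundle.

    If the operation produced nothing to report, a bare <ok/> is sent so
    that the client still receives a confirmation.
    """
    bundle = ET.Element("messages")
    if result_elements:
        bundle.extend(result_elements)
    else:
        ET.SubElement(bundle, "ok")
    return ET.tostring(bundle, encoding="unicode")


print(build_reply([]))                        # confirmation only
print(build_reply([ET.Element("settings")]))  # data implies success
```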

Simplified example of communication between client and server

Client (start of communication, specification of protocol version, authentication):

  <connect/>
  <login/>

Server (confirmation, specification of protocol version, setting of parameters):

  <connected/>
  <logged/>
  <settings/>

Client (subscription to receiving annotations, synchronization of the annotated document, request for the tree of annotation types):

  <session/>
  <subscribe/>
  <synchronize/>
  <queryTypes/>

Server (URI of synchronized resource (server copy), requested annotations and types of annotations):

  <synchronized/>
  <annotations/>
  <types><add/></types>

Client (request to add an annotation type):

  <session/>
  <types><add/></types>

Server (returns the types just added, to make it easier to pair the types received on the comet channel with those that were sent (URIs may be completed by the server)):

  <types><add/></types>

Server (sends the types just added) - on comet channel:

  <types><add/></types>

Client (sends an added annotation):

  <session/>
  <annotations/>

Server:

  <ok/>

Server (sends back the annotation just added, with identifiers assigned):

  <annotations/>

Client (sends a changed annotation):

  <session/>
  <annotations/>

Server (returns error message):

  <errors/>

Client (requests the original content of the annotation):

  <session/>
  <reload/>

Server (returns requested annotation):

  <annotations/>

Client (adds an annotation):

  <session/>
  <annotations/>

Server (synchronization error):

  <errors/>
  <resynchronize/>

Client (resynchronization):

  <session/>
  <resynchronize/>

Server (all annotations of given text from selected (subscribed) resources):

  <annotations/>

Client (request for change of settings):

  <session/>
  <settings/>

Server (current settings):

  <settings/>

Client (request for suggested annotations):

  <session/>
  <suggestAnnotations/>

Server (annotation suggestions):

  <suggestions/>

Client (synchronization of (transition to) another annotated document):

  <session/>
  <synchronize/>

Server:

  <synchronized/>
  <annotations/>

Client (log out, disconnect):

  <session/>
  <logout/>
  <disconnect/>

Server:

  <ok/>
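Each step of the exchange above is one bundle. The following sketch shows how a client could assemble such bundles. The helper name is an assumption, and the messages are left empty as in the simplified example; real messages carry attributes and content.

```python
import xml.etree.ElementTree as ET


def build_bundle(session_id, *message_names):
    """Create a <messages> bundle with an optional <session> first."""
    bundle = ET.Element("messages")
    if session_id is not None:
        ET.SubElement(bundle, "session", {"id": str(session_id)})
    for name in message_names:
        ET.SubElement(bundle, name)
    return bundle


# The first bundle has no session yet; later ones start with it.
first = build_bundle(None, "connect", "login")
later = build_bundle(0, "subscribe", "synchronize", "queryTypes")
print(ET.tostring(first, encoding="unicode"))
print(ET.tostring(later, encoding="unicode"))
```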

Example of complete message from client to server

<?xml version="1.0" encoding="utf-8" ?>
<messages xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
  xmlns:a="http://nlp.fit.vutbr.cz/annotations/AnnotXMLSchema">
  <session id="0"/>
  <annotations>
    <add>
      <annotation>
        <rdf:Description>
          <rdf:type rdf:resource="http://example.com/types/g1/Person"/>
          <a:dateTime rdf:value="2011-02-16T22:00:00Z"/>
          <a:author id="http://example.com/authors/1" name="Jaroslav Dytrych" address="idytrych@fit.vutbr.cz"/>
          <a:source rdf:resource="http://example.com/documents/getDoc?id=1"/>
          <a:fragment>
            <a:path>/html/body/p[6]/text()</a:path>
            <a:offset>8</a:offset>
            <a:length>8</a:length>
            <a:annotatedText>idytrych</a:annotatedText>
          </a:fragment>
          <a:content>
          <![CDATA[Test annotation type Person with attributes
]]>
          </a:content>
          <a:attribute name="Name" type="String" rdf:value="Jaroslav Dytrych"/>
          <a:attribute name="Email" type="String" rdf:value="idytrych@fit.vutbr.cz"/>
          <a:attribute name="Age" type="Integer" rdf:value="26"/>
          <a:attribute name="Work" type="geoPoint">
            <geo:Point> <geo:lat>49.2269</geo:lat> <geo:long>16.5956</geo:long> </geo:Point>
          </a:attribute>
        </rdf:Description>
      </annotation>
    </add>
  </annotations>
</messages>
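A receiver could pick the essential parts out of such a message as sketched below. This is an illustration only: the function name and the result layout are assumptions, the sample is a shortened version of the message above, and attributes with structured values (such as geoPoint) are skipped.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "a": "http://nlp.fit.vutbr.cz/annotations/AnnotXMLSchema",
}
RDF = "{" + NS["rdf"] + "}"


def added_annotations(bundle_xml):
    """List type, fragments and simple attributes of each added annotation."""
    root = ET.fromstring(bundle_xml)
    result = []
    for desc in root.iterfind("annotations/add/annotation/rdf:Description", NS):
        fragments = [
            {
                "path": f.findtext("a:path", namespaces=NS),
                "offset": int(f.findtext("a:offset", namespaces=NS)),
                "length": int(f.findtext("a:length", namespaces=NS)),
                "text": f.findtext("a:annotatedText", namespaces=NS),
            }
            for f in desc.findall("a:fragment", NS)
        ]
        # Only attributes that carry their value in rdf:value are read.
        attributes = {
            a.get("name"): a.get(RDF + "value")
            for a in desc.findall("a:attribute", NS)
            if a.get(RDF + "value") is not None
        }
        type_uri = desc.find("rdf:type", NS).get(RDF + "resource")
        result.append(
            {"type": type_uri, "fragments": fragments, "attributes": attributes}
        )
    return result


sample = """<messages xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:a="http://nlp.fit.vutbr.cz/annotations/AnnotXMLSchema">
  <session id="0"/>
  <annotations><add><annotation>
    <rdf:Description>
      <rdf:type rdf:resource="http://example.com/types/g1/Person"/>
      <a:fragment>
        <a:path>/html/body/p[6]/text()</a:path>
        <a:offset>8</a:offset>
        <a:length>8</a:length>
        <a:annotatedText>idytrych</a:annotatedText>
      </a:fragment>
      <a:attribute name="Age" type="Integer" rdf:value="26"/>
    </rdf:Description>
  </annotation></add></annotations>
</messages>"""
parsed = added_annotations(sample)
print(parsed)
```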

Protocol for exporting annotations to the external databases version 1.1

External databases need information about specific types of annotations but do not need all the information that is sent to 4A clients. So only selected messages are included in the export protocol (annotations, ok). However, the information contained in the selected messages for the annotation client is not sufficient for external databases, because they need information about contexts that is sent in messages which are not exported, or which is not needed by the clients. Therefore, some new information was added to the <annotation> message.

Commentary and list of ancestors for annotation type

First, the parameter ontologyUri was added to the <rdf:type> tag. This parameter is optional and represents the URI of the annotation type in an ontology. For each annotation (annotation type) sent to the external database, a list of annotation type ancestors must be generated. This also applies to nested annotations. The list of ancestors is enclosed in <ancestorsList></ancestorsList> tags, and the items of the list are placed between these tags. Each item of the list is an <ancestorType/> tag with a uri="" attribute and an optional ontologyUri attribute. These attributes are the URI of the annotation type and the URI of the annotation type in the ontology. Here is an example of an annotation type ancestors list:

<ancestorsList>
	<ancestorType uri="http://example.com/Annotations/types/g1/Thing/Abstract"/>
	<ancestorType uri="http://example.com/Annotations/types/g1/Thing"/>
</ancestorsList>

The annotation type (tag <rdf:type>) was modified as well: it now contains the list of ancestors, and a comment has been added for each type of annotation. The comment is enclosed in <comment></comment> tags and must be in a CDATA section. The comment is optional. Here is an example of the annotation type sent to the external database:

<rdf:type rdf:resource="http://example.com/Annotations/types/g1/Thing/Abstract" ontologyUri="http://ontologyuri.com/uri">
	<ancestorsList>
		<ancestorType uri="http://example.com/Annotations/types/g1/Thing/Abstract"/>
		<ancestorType uri="http://example.com/Annotations/types/g1/Thing"/>
	</ancestorsList>
	<comment>
		<![CDATA[ ... ]]>
	</comment>
</rdf:type>
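Generating this element can be sketched with xml.dom.minidom, which is able to write CDATA sections. The function name and parameters are assumptions made for illustration, the per-ancestor ontologyUri attribute is omitted for brevity, and a comment containing the sequence "]]>" would need extra handling.

```python
from xml.dom.minidom import Document


def build_export_type(type_uri, ancestor_uris, ontology_uri=None, comment=None):
    """Build the <rdf:type> element sent to an external database."""
    doc = Document()
    rdf_type = doc.createElement("rdf:type")
    rdf_type.setAttribute("rdf:resource", type_uri)
    if ontology_uri is not None:
        rdf_type.setAttribute("ontologyUri", ontology_uri)
    ancestors = doc.createElement("ancestorsList")
    for uri in ancestor_uris:
        item = doc.createElement("ancestorType")
        item.setAttribute("uri", uri)
        ancestors.appendChild(item)
    rdf_type.appendChild(ancestors)
    if comment is not None:
        # The comment is optional, but must be in a CDATA section.
        node = doc.createElement("comment")
        node.appendChild(doc.createCDATASection(comment))
        rdf_type.appendChild(node)
    return rdf_type.toxml()


exported = build_export_type(
    "http://example.com/Annotations/types/g1/Thing/Abstract",
    [
        "http://example.com/Annotations/types/g1/Thing/Abstract",
        "http://example.com/Annotations/types/g1/Thing",
    ],
    ontology_uri="http://ontologyuri.com/uri",
    comment="An abstract thing.",
)
print(exported)
```

The ancestor URIs are passed in explicitly rather than derived from the type URI, because with multiple inheritance the ancestors cannot be read from the path alone.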

Address of the annotated document

Information about the original URI of the document has been added to the <source> tag in the form of the parameter uri="". This parameter specifies the real URI of the document. Here is an example of a document that is represented by the address http://example.com/Annotations/documents/getDoc?id=6 on the 4A server and whose original address is http://example.com/decipher-3/1320:

...
<a:source rdf:resource="http://example.com/Annotations/documents/getDoc?id=6" uri="http://example.com/decipher-3/1320"/>
...

Fragment is always linearized

Fragments of the document must be linearized, because linearized fragments are easier for the external database interface to work with. In addition, the external database does not update annotations autonomously, so it does not need such robust positioning.

Additional attributes for attribute of annotation

Two new attributes have been added to the element <a:attribute/>. The first is typeOntologyUri="", the URI of the ontology entity for the type of the attribute. Addresses of the primitive types are taken from the XSD primitive types. The second is ontologyUri="", whose value is the URI of the attribute in the ontology. Here is an example of an annotation attribute with the new attributes:

...
<a:attribute name="Boolean parameter" typeOntologyUri="http://www.w3.org/2001/XMLSchema#boolean" ontologyUri="http://example.com/ontology/1234" type="Boolean" rdf:value="false"/>
...

Additional information about communication

The external database interface must always answer with the message <?xml version="1.0" encoding="utf-8"?><ok/>. The 4A server stores all messages for the interface while the external database (also client) is unavailable or responds incorrectly. The stored messages are sent when the interface becomes available.
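The storing of undelivered messages can be sketched as a simple queue. Everything here is an assumption made for illustration: the class, the send callable (which returns the reply text or raises OSError when the external database cannot be reached) and the strict comparison of the reply with the expected text.

```python
from collections import deque

OK_REPLY = '<?xml version="1.0" encoding="utf-8"?><ok/>'


class ExportQueue:
    """Keeps export messages until the external database confirms them."""

    def __init__(self, send):
        # send is a hypothetical transport callable supplied by the server.
        self._send = send
        self._pending = deque()

    def export(self, message):
        """Store the message and try to deliver everything stored so far."""
        self._pending.append(message)
        return self.flush()

    def flush(self):
        """Send stored messages in order; stop at the first failure."""
        while self._pending:
            try:
                reply = self._send(self._pending[0])
            except OSError:
                return False  # unavailable: keep the message for later
            if reply.strip() != OK_REPLY:
                return False  # incorrect response: keep the message
            self._pending.popleft()
        return True

    def pending_count(self):
        return len(self._pending)


replies = ["garbage", OK_REPLY, OK_REPLY]
queue = ExportQueue(lambda message: replies.pop(0))
print(queue.export("<messages><annotations/></messages>"))  # not confirmed
print(queue.export("<messages><annotations/></messages>"))  # both delivered
```

Delivering in the original order matters here, since a later message may modify or delete an annotation created by an earlier one.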

Only information about annotations (creating, deleting or modifying) should be sent to the external database by this protocol. When an annotation with an annotationLink attribute occurs, the server must also send the annotation that is linked by this attribute. Also, if there are annotations that have an annotation being sent as the value of an attribute of type nested annotation, those annotations must be sent too. These rules also apply to the annotations that are added by these rules.
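Because the rules apply again to every annotation they add, the set of annotations to send is a closure over the relations between annotations. The sketch below assumes the server can provide two lookup tables, both hypothetical: one for annotationLink targets and one for annotations related through an attribute of type nested annotation.

```python
def annotations_to_export(start_uri, links, related_by_nesting):
    """Collect the URIs of all annotations to send along with start_uri.

    links              -- dict: URI -> URIs referenced by annotationLink
    related_by_nesting -- dict: URI -> URIs of annotations connected to it
                          through an attribute of type nested annotation
    """
    selected = []
    seen = set()
    stack = [start_uri]
    while stack:
        uri = stack.pop()
        if uri in seen:
            continue  # also protects against cycles of links
        seen.add(uri)
        selected.append(uri)
        stack.extend(links.get(uri, ()))
        stack.extend(related_by_nesting.get(uri, ()))
    return selected


links = {"a": ["b"], "b": ["a", "c"]}
nesting = {"c": ["d"]}
print(annotations_to_export("a", links, nesting))
```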

4A Annotation Format

The format of an annotation is based on RDF/XML, where the subject is always the annotation. There are, however, some simplifications that make clients and the server easier to implement. As a result it is not valid RDF/XML, but it can easily be transformed (exported) to valid RDF/XML.

Parts of the annotation are:

type of annotation
date and time of creation
author
URI of annotated document (more precisely copy of this document on the server)
annotated fragments
textual content of the annotation
attributes

Annotations are structured using attributes. Attributes create the inner structure of an annotation and can also be used to build complex structures from multiple annotations. Each attribute has a name, a type and a value. Attributes can have values of simple types (see the attributes of types of annotations above) or values that are annotations (nested annotations or links to other annotations). Links to other annotations can be used for connecting to another annotation, grouping annotations, etc. An annotation can have any number of attributes, and if the value of an attribute is a nested annotation, it can have any number of attributes too. The level of nesting is not restricted.

Types of annotations are hierarchical and organized into a tree structure. Since protocol version 1.1 there is also multiple inheritance, which turns this structure into a general acyclic graph. The tree still underlies it, however, and the other edges of the graph are only virtual (they are only attributes of the type of annotation and cannot be stored in the annotation directly). Basic types can be common types of annotations (e.g. note, description, comment) and basic types of entities (thing, person, animal, etc.). Users can then create new subtypes and so build a complex tree of types. This makes it possible to distinguish types of annotated fragments and not only types of annotations. Types can then be used as tags, and a user can use annotations only for tagging.

Each user group shares its own tree of types of annotations. Globally it is one tree in which individual user groups create their own branches. Types are identified by user group, position in the tree and name within this global tree.

When a type is entered in an attribute or text area, it is identified by its linearized name or by its URI. The two forms are alternatives that can be converted in both directions, but only within one user group. The URI ends with the user group followed by the path in the tree of types, where individual types are separated by a slash. The linearized name is the path in the tree of types, where individual types are separated by the character sequence " -> ". The user group is not specified in the linearized name, because it is assumed to be set in another form field or in the settings (parameter DefaultGroup, see above). The linearized name is more user friendly and should be displayed to the user wherever possible. The URI is unambiguous and contains the user group, so it should be used for communication between client and server or for storing annotations.
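The conversion between the two forms can be sketched as follows. The base address "http://example.com/types/" and the function names are assumptions for illustration. The parser accepts the separator with or without surrounding spaces, since both spellings appear in this specification, and type names that themselves contain a slash or the separator are not handled.

```python
SEPARATOR = "->"


def uri_to_linearized(uri, types_base, shown_separator=" -> "):
    """Turn a type URI into (user group, linearized name)."""
    if not uri.startswith(types_base):
        raise ValueError("URI is not below the types base: " + uri)
    # The first path segment after the base is the user group.
    group, _, path = uri[len(types_base):].partition("/")
    return group, shown_separator.join(path.split("/"))


def linearized_to_uri(name, group, types_base):
    """Turn a linearized name and a user group into a type URI."""
    parts = [part.strip() for part in name.split(SEPARATOR)]
    return types_base + group + "/" + "/".join(parts)


BASE = "http://example.com/types/"  # hypothetical base address
uri = linearized_to_uri("Animal -> People -> Employee", "g1", BASE)
print(uri)
print(uri_to_linearized(uri, BASE))
```

The user group has to be supplied separately when converting from the linearized name, which is why the conversion only works within one user group.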

The author of the annotation is identified by a system-wide unique identifier (URI), which is assigned on the server. The displayed name (full name or nickname) is in the attribute name, and other attributes can carry additional information (email, URI of the user's photo, etc.).

The URI of the annotated document (attribute rdf:resource of the element a:source) identifies the copy of the annotated document stored on the server. Before annotation starts, the client sends a copy of the content of the annotated document to the server together with the original URI of this document (see above), and the server returns a URI for use in annotations. This makes it possible to strip session ids etc. from the URI.

For robust positioning of an annotated fragment, each fragment is described by the XPath of the DOM node, the offset, the length of the annotated text and the textual content of the fragment. When working with a linearized document, the XPath is not included and the position is described only by the offset, the length and the textual content of the fragment. The path to the annotated fragment unambiguously points to the specific area of the document (e.g. heading, list item, table cell) that contains the annotated fragment. The offset and length determine the specific piece of text. The text content of the fragment must be included too, in case the document changes. If a fragment is not found after a change of the document, it is marked with the attribute valid set to false, but it is still included in the annotation to preserve the full length of the original annotated text. An annotation can be assigned to multiple pieces of text, but from the user's perspective this is not included in the design of the client, because it is not commonly useful (fragments can be annotated by individual annotations or collected into attributes of another annotation, which can describe relations between the individual fragments and not only their selections).

Because in most cases a whole sentence or part of a sentence is annotated, selections spanning multiple DOM nodes are rare. If this situation does occur, the selected text is automatically split into several fragments (one per DOM node). This is the most robust way (the most tolerant to changes of the document) to describe such a selection.

Moving an annotation to the global level is called orphaning. In this situation none of the fragments of the annotation was found in the document. If an annotation has multiple fragments and only some of them were not found, it can still be valid for the remaining fragments.
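Checking a fragment against the current document and classifying the annotation can be sketched as follows. The sketch assumes the text of the node addressed by the path has already been obtained, it only checks the stored position and does not try to relocate moved text, and the function names and state labels are illustrative.

```python
def fragment_is_valid(node_text, offset, length, annotated_text):
    """True if the stored text is still at the stored position."""
    return node_text[offset:offset + length] == annotated_text


def orphan_state(fragment_validity):
    """Classify an annotation from the validity of its fragments.

    fragment_validity -- list of booleans, one per fragment; at least
                         one fragment is expected.
    """
    if all(fragment_validity):
        return "valid"
    if any(fragment_validity):
        return "partially orphaned"
    return "orphaned"


# Offset 8, length 8 and the text "idytrych" as in the message example;
# the surrounding node text is made up for illustration.
node_text = "Contact idytrych@fit.vutbr.cz"
still_there = fragment_is_valid(node_text, 8, 8, "idytrych")
after_edit = fragment_is_valid("Write to idytrych@fit.vutbr.cz", 8, 8, "idytrych")
print(still_there, after_edit)
print(orphan_state([still_there, after_edit]))
```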

The text content of the annotation contains textual information entered by the user. Its meaning can be given by the type of annotation, but in most cases it is a textual note, comment, description, complementary information, etc.

Example of structured annotation

<rdf:Description rdf:about="http://example.com/annotations/123456">
  <rdf:type rdf:resource="http://example.com/types/g01/annotation/task"/>
  <a:dateTime rdf:value="2011-01-01T20:00:00Z" />
  <a:author id="http://example.com/authors/123456" name="Jaroslav Dytrych" 
            address="idytrych@fit.vutbr.cz"/>
  <a:source rdf:resource="http://example.com/documents/getDoc?id=123456"/>
  <a:fragment>
    <a:path>/html/body/div[@id='container']/div[@id='main']/div[@id='post1']/DIV[2]/p[1]</a:path>
    <a:offset>22</a:offset>
    <a:length>33</a:length>
    <a:annotatedText>Faculty of Information Technology</a:annotatedText>
  </a:fragment>
  <a:content>
    <![CDATA[
        ...
      ]]>
  </a:content>
  <a:attribute name="place" type="geoPoint">
    <geo:Point> <geo:lat>55.701</geo:lat> <geo:long>12.552</geo:long> </geo:Point>
  </a:attribute>
  <a:attribute name="date" type="nestedAnnotation">
    <rdf:Description rdf:about="http://example.com/annotations/123457">
      <rdf:type rdf:resource="http://example.com/types/g01/annotation/description"/>
      <a:dateTime rdf:value="2011-01-01T20:00:00Z" />
      <a:author id="http://example.com/authors/123456" name="Jaroslav Dytrych" 
                address="idytrych@fit.vutbr.cz"/>
      <a:source rdf:resource="http://example.com/documents/getDoc?id=123456"/>
      <a:fragment>
        <a:path>/html/body/div[@id='container']/div[@id='main']/div[@id='post1']/p[1]</a:path>
        <a:offset>92</a:offset>
        <a:length>17</a:length>
        <a:annotatedText>14th January 2011</a:annotatedText>
      </a:fragment>
      <a:content>
      <![CDATA[
            ...
          ]]>
      </a:content>
      <a:attribute name="date" type="DateTime" rdf:value="2011-01-14T00:00:00Z"/>
    </rdf:Description> 
  </a:attribute>
  <a:attribute name="File1" type="Binary" rdf:value="TG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQsIGNvbnNlY3RldHVlciBhZGlwaXNjaW5nIGVsaXQsIHNlZCBkaWFtIG5vbnVtbXkgbmliaCBldWlzbW9kIHRpbmNpZHVudCB1dCBsYW9yZWV0IGRvbG9yZSBtYWduYSBhbGlxdWFtIGVyYXQgdm9sdXRwYXQuIA0KVXQg"/> 
  <a:attribute name="txt" type="Text"> 
    <a:Content>
      <![CDATA[Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat.]]> 
    </a:Content> 
  </a:attribute> 
  <a:attribute name="reason" type="annotationLink" uri="http://example.com/annotations/1234567"/>
</rdf:Description>
