IO Data


Abstract: This specification defines an XMPP protocol extension for handling the input to and output from a remote entity.

Number: 0244
Status: Deferred
Type: Standards Track
SIG: Standards
Approver: Council
Dependencies: XMPP Core, XEP-0001, XEP-0030, XEP-0050
Short name: NOT_YET_ASSIGNED

Authors:
Johannes Wagener (email: johannes.wagener@med.uni-muenchen.de, JID: edrin@jabber.org)
Egon Willighagen (email: egonw@users.sf.net, JID: egonw@jabber.org)
Andreas Heusler (email: aheusler@in.tum.de, JID: krach@jabber.org)
Tobias Markmann (email: tm@ayena.de, JID: tm@ayena.de)
Ola Spjuth (email: ola.spjuth@farmbio.uu.se, JID: olas@pele.farmbio.uu.se)

0.1 2008-06-18 psa

Initial published version.

0.0.4 2008-06-05 jw/ew

The IO Data specific commands (procedure status and output request) were moved into the IO Data namespace. The schema was adapted to become extensible.

0.0.3 2008-04-16 jw

Applied the modifications suggested in the discussion on the XMPP standards mailing list.

0.0.2 2008-03-20 jw

Added some missing namespaces in two examples.

0.0.1 2008-02-25 jw

Initial Version.

Ad-Hoc Commands (XEP-0050) has become a popular and widespread XMPP protocol extension for executing functions on remote systems. It is supported by many XMPP client and service implementations. To date, almost all of these implementations rely on Data Forms (XEP-0004) as the data container. However, Ad-Hoc Commands is explicitly designed to be used in combination with other data containers as well. This applies to cases where the Data Forms specification does not fit the needs; for example, Data Forms can be too restrictive regarding the strong typing of data (see Section 1.2).

The intention of this XEP is to define a data container for cases where Data Forms is not applicable or not optimal. The data container defined herein (IO Data) is very generic and discoverable. It is intended for purposes other than those served by Data Forms.

The Data Forms data container has certain restrictive limitations:

The supported Field Types are limited: text input fields, drop down boxes, different selectable optional values, etc. See Data Forms - Field Types
The only allowed content type of the fields is xs:string. See Data Forms - XML Schema
According to current specifications it is not possible to transport complex tree-based data structures. For example, the child elements of a form field cannot contain further nested elements, so a key feature of XML is lost.

These limitations are not a flaw. They suit the specific use case in which a client has to render a graphical representation of the service; the HTML counterpart is an HTML form. For a chat client developer, this makes it plain and simple to build a generic graphical client implementation with a few text-input fields.

Under current standards, encapsulating more complex data in the Data Forms data container is not supported. For example, it is not possible to encapsulate a complete XML document (the truly "generic" data container) in a Data Form unless the XML document is encoded as xs:string, which would be considered bad practice.

However, specialized clients are being developed to make use of the service-oriented architecture of XMPP. For example, an XMPP client implementation may mirror the Application Programming Interface (API) of an XMPP service by making use of Ad-Hoc Commands.

The limitations of Data Forms make it impossible to define and handshake these actions clearly and precisely and without confusing existing and future implementations for the following reasons:

Data Forms does not support a "Schemata Discovery". The form descriptor that Data Forms provides (type='form') is not separated from the data transaction in the Ad-Hoc Command logic described in XEP-0050. Each function invocation would therefore trigger another form descriptor submission, causing unnecessary traffic. It is sufficient to discover the IO Schemata once.
Encapsulating XML documents in Data Forms is generally not recommended.

Besides Ad-Hoc Commands, two other XEPs provide mechanisms to execute a function on a remote system: Jabber-RPC (XEP-0009) and SOAP over XMPP (XEP-0072).

However, Jabber-RPC and SOAP over XMPP lack certain functionality that is important for flexible, simple and robust Web Services. Because of the limited expressiveness of XML-RPC's data types, Jabber-RPC is not suitable for complex functionality, similar to the limitations of Data Forms. While SOAP over XMPP supports complex data types, it lacks an obvious mechanism for asynchronous usage. For example, it has no stateful design by default: there is no sessionid as in Ad-Hoc Commands. In addition, SOAP brings in considerable complexity (XML-associated abstractions) that was required for its primary transport layer, HTTP. This complexity is unnecessary here because XMPP already implements the required XML-associated abstractions. There are also other arguments against SOAP. For example, to date most services implemented with SOAP over HTTP are compatible with only a subset of SOAP libraries.

In contrast Ad-Hoc Commands comprises simple, clean and optionally stateful Web Service mechanisms by default. In addition to that asynchronous client notification can be achieved with a <message>, as indicated in Ad-Hoc Commands and as realized in some unofficial implementations.

In conclusion and as already suggested in Ad-Hoc Commands we describe an alternative data container. This data container is more generic in the way it can be used:

It supports a "Schemata Discovery". Thus a client implementation can marshal an API for the input and output (and optionally for a service specific error) of a certain service.
This "Schemata Discovery" is separated from the data transaction. This reduces the amount of unnecessary traffic.
The Field Types of the described data container are on the one hand clearly defined (there is only description, input, output, error, and status) and on the other hand straightforward. Thus any kind of XML data (XML Document with namespaces that represent any imaginable data object) can be submitted.

It is important to note that this XEP does not intend to replace or extend Data Forms, nor does it break any current Ad-Hoc Commands implementations. It only offers another data container that is a much better fit in circumstances where no GUI is rendered around an Ad-Hoc Command service.

The base syntax for the 'urn:xmpp:tmp:io-data' namespace is as follows; a formal description can be found in the XML Schema section below.

Transaction Type	Purpose	Associated Ad-Hoc Command	REQUIRED for generic XEP compatibility	Contained Elements
io-schemata-get	To request the schemata of input and output.	execute	yes	-
input	To submit the input.	execute	yes	<in>
getStatus	To request the status of the procedure.	next	yes	-
getOutput	To request the output.	next, complete	yes	-

Transaction Type	Purpose	Associated Ad-Hoc Command status value	REQUIRED for generic XEP compatibility	Contained Elements
io-schemata-result	To return the schemata of input and output.	completed	yes	<desc> <in> <out>
output	To submit the output.	executing, completed	yes	<out>
error	To submit additional error information.	executing	no	<error>
status	To indicate the current status of the procedure.	executing	no	<status>
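The two tables above can be read as a lookup from transaction type to the child elements it may carry. The following is a minimal sketch of that reading, not part of the specification; the dictionary layout and the helper name `is_valid_payload` are illustrative assumptions.

```python
# Child elements each transaction type may carry, transcribed from the
# two tables above. The dictionary layout and helper name are illustrative.
REQUEST_TYPES = {
    "io-schemata-get": set(),
    "input": {"in"},
    "getStatus": set(),
    "getOutput": set(),
}
RESPONSE_TYPES = {
    "io-schemata-result": {"desc", "in", "out"},
    "output": {"out"},
    "error": {"error"},
    "status": {"status"},
}
ALL_TYPES = {**REQUEST_TYPES, **RESPONSE_TYPES}


def is_valid_payload(transaction_type, child_names):
    """Return True if the type is known and every child name is listed for it."""
    if transaction_type not in ALL_TYPES:
        return False
    return set(child_names) <= ALL_TYPES[transaction_type]


print(is_valid_payload("input", ["in"]))
print(is_valid_payload("getStatus", ["out"]))
```

A check like this could run before a stanza is sent or after one is received; it only covers which elements appear, not their content.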

<desc> -- a textual description of the IO Data data container (xs:string).

<in> -- contains the input. Valid for Transaction Type 'input' and 'io-schemata-result' only. May contain any XML data (XML Schema, XML Document ...).

<out> -- contains the output. Valid for Transaction Type 'output' and 'io-schemata-result' only. May contain any XML data (XML Schema, XML Document ...).

<error> -- describes the error raised by the procedure invocation. This element is optional and valid for Transaction Type 'error' and 'io-schemata-result' only. May contain any XML data (XML Schema, XML Document ...).

<status> -- describes the status of the procedure. This element is optional and valid for Transaction Type 'status' only.

<elapsed> -- an integer value of the time in milliseconds that elapsed since the procedure was invoked (xs:integer).

<remaining> -- an integer value of the (estimated) time in milliseconds till the procedure will finish (xs:integer).

<percentage> -- the percentage of the procedure that is finished (xs:integer).

<information> -- describes the current status of the procedure.
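As an illustration of the status elements listed above, the sketch below assembles a status payload with Python's standard `xml.etree.ElementTree`. The wrapper element name `iodata` and the function name `build_status` are assumptions made for this example; the text above only names the namespace, the transaction type and the child elements.

```python
import xml.etree.ElementTree as ET

IO_NS = "urn:xmpp:tmp:io-data"


def build_status(elapsed_ms, remaining_ms, percentage, information):
    """Build an IO Data element of type 'status' with its four children.

    The wrapper element name 'iodata' is an assumption for illustration.
    """
    iodata = ET.Element(f"{{{IO_NS}}}iodata", {"type": "status"})
    status = ET.SubElement(iodata, f"{{{IO_NS}}}status")
    fields = (
        ("elapsed", elapsed_ms),      # milliseconds since invocation
        ("remaining", remaining_ms),  # estimated milliseconds until finished
        ("percentage", percentage),   # share of the procedure that is done
        ("information", information), # free-form status description
    )
    for name, value in fields:
        child = ET.SubElement(status, f"{{{IO_NS}}}{name}")
        child.text = str(value)
    return iodata


element = build_status(1200, 3800, 24, "Encoding in progress")
print(ET.tostring(element, encoding="unicode"))
```

The integer fields are written as plain decimal text, which matches the xs:integer typing given in the element descriptions.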

Commands (that is, remote procedures) executed with Ad-Hoc Commands and IO Data SHOULD NOT keep the requester in an uncertain state. This means the responder SHOULD always respond to the requester as quickly as possible, so that the requester obtains the sessionid. (Because some remote procedures or calculations are cost-intensive and/or time-consuming, the requester MUST "save" this sessionid in case a network problem occurs.)

The Ad-Hoc Command logic applied for the IO Data data container should be associated with the following rules and keywords:

Ad-Hoc Command	Keyword	Associated Transaction Type	Subsequently allowed commands	Status description
execute	Get Schemata	io-schemata-get	-	XML Schemata are returned immediately
execute	Start procedure	input	-	output returns immediately (synchronous)
	Start procedure	input	next, cancel	asynchronous procedure was invoked
next	Check status	getStatus	next, cancel	asynchronous procedure not finished
	Check status	getStatus	next, complete, cancel	asynchronous procedure finished
	Get result	getOutput	next, complete, cancel	result was delivered
cancel	Cancel/delete procedure	-	-	procedure terminated
complete	Get result + delete procedure	-	-	result was delivered, procedure terminated

If a service can return the output immediately, it MAY respond with status='completed' and return the output (IO Data type='output'). This behavior is NOT RECOMMENDED for procedures that need more than 5 seconds to complete or that are cost-intensive.

If a service cannot return the result immediately (that is, the procedure needs more than 5 seconds to complete) or the invoked procedure is cost-intensive, it SHOULD respond with status='executing' and an <actions> element containing the <next> element.
If the service returned status='executing', the requester MAY stay up to date by proceeding with action='next' combined with the IO Data transaction type='getStatus'. As long as the procedure has not finished, the responder MUST respond with status='executing' and an <actions> element containing only the <next> element.
Once the procedure has finished, the responder MUST respond to this request (action='next' combined with the IO Data transaction type='getStatus') with status='executing' and an <actions> element containing the <next> and <complete> elements to indicate that the output is ready for collection. The requester MAY then request the result by proceeding with action='complete', or with action='next' combined with the IO Data transaction type='getOutput'.
Asynchronous notification: Once the procedure has finished, the service MUST actively notify the requester by sending a message containing an Ad-Hoc Command element with status='executing' and an <actions> element containing the <next> and <complete> elements to indicate that the result is ready for collection.
If the requester requests the output with action='complete', the responder MUST return the result (IO Data transaction type='output') with status='completed'. This terminates the Ad-Hoc Command session. The responder MUST subsequently delete the associated procedure and result.
If the requester requests the output with action='next' combined with the IO Data transaction type='getOutput', the responder MUST return the result (IO Data transaction type='output') with status='executing' and an <actions> element containing the <next> and <complete> elements to indicate that the Ad-Hoc Command session continues to exist and the output is still available. The requester MUST subsequently delete the associated procedure and result with action='cancel'.
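The responder rules above can be summarized as a small decision function. This is a sketch under stated assumptions rather than a reference implementation: the function name `respond`, the tuple layout, and the choice to reject output requests before the procedure has finished are all illustrative, since the text does not define that last case.

```python
def respond(action, transaction, finished):
    """Map a request on an asynchronous procedure to the responder's reply.

    Returns (command_status, allowed_actions, iodata_type), where
    iodata_type is None when no IO Data payload is required.
    """
    if action == "next" and transaction == "getStatus":
        if finished:
            # Output is ready: offer both <next> and <complete>.
            return ("executing", ["next", "complete"], None)
        # Still running: offer <next> only.
        return ("executing", ["next"], None)
    if action in ("next", "complete") and not finished:
        # Assumption: the rules above do not cover this case, so reject it.
        raise ValueError("output requested before the procedure finished")
    if action == "next" and transaction == "getOutput":
        # Session stays open; the requester must cancel it afterwards.
        return ("executing", ["next", "complete"], "output")
    if action == "complete":
        # Session ends; the responder deletes procedure and result.
        return ("completed", [], "output")
    if action == "cancel":
        return ("canceled", [], None)
    raise ValueError(f"unsupported request: {action!r}, {transaction!r}")


print(respond("next", "getStatus", finished=False))
print(respond("complete", None, finished=True))
```

The status value 'canceled' for the cancel case comes from Ad-Hoc Commands (XEP-0050), not from the rules listed above.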

Besides the errors associated with the IQ or Ad-Hoc Command abstraction layer, an internal procedure error may occur.

If the procedure invocation fails (an error occurs), the responder MUST respond with status='completed'. To indicate that the procedure failed, the <note> element MUST have type='error' as described in Ad-Hoc Commands (XEP-0050). The service may provide additional error information within the IO Data data container (IO Data transaction type='error').
Asynchronous implementation only: If the service returned status='executing' (asynchronous implementation) and the procedure then fails (an error occurs), the service MUST actively notify the requester by sending a message containing an Ad-Hoc Command element with status='executing' and an <actions> element containing the <next> element. To indicate that the procedure failed, the <note> element MUST have type='error' as described in Ad-Hoc Commands (XEP-0050). The service may provide additional error information within the IO Data data container (IO Data transaction type='error'). The requester SHOULD subsequently delete the associated procedure with action='cancel'.
Asynchronous implementation only: If the procedure has failed (an error occurred), the responder MUST respond to a status request (action='next') with status='executing' and an <actions> element containing the <next> element. To indicate that the procedure failed, the <note> element MUST have type='error' as described in Ad-Hoc Commands (XEP-0050). The service may provide additional error information within the IO Data data container (IO Data transaction type='error'). The requester SHOULD subsequently delete the associated procedure with action='cancel'.
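To make the asynchronous error notification more concrete, the following sketch builds such a message with `xml.etree.ElementTree`. The command namespace and the `<actions>`, `<next>` and `<note>` elements come from Ad-Hoc Commands (XEP-0050); the wrapper element name `iodata`, the function name, and all addresses and identifiers are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

CMD_NS = "http://jabber.org/protocol/commands"
IO_NS = "urn:xmpp:tmp:io-data"


def build_error_notification(to, node, sessionid, note_text):
    """Build a <message> telling the requester that the procedure failed."""
    message = ET.Element("message", {"to": to})
    command = ET.SubElement(
        message,
        f"{{{CMD_NS}}}command",
        {"node": node, "sessionid": sessionid, "status": "executing"},
    )
    # Only <next> is offered; the session stays open until it is canceled.
    actions = ET.SubElement(command, f"{{{CMD_NS}}}actions")
    ET.SubElement(actions, f"{{{CMD_NS}}}next")
    # The note with type='error' is what marks the failure.
    note = ET.SubElement(command, f"{{{CMD_NS}}}note", {"type": "error"})
    note.text = note_text
    # Optional extra detail; the 'iodata' element name is an assumption.
    iodata = ET.SubElement(command, f"{{{IO_NS}}}iodata", {"type": "error"})
    ET.SubElement(iodata, f"{{{IO_NS}}}error")
    return message


stanza = build_error_notification(
    "requester@example.org/client",  # hypothetical address
    "encode-node",                   # hypothetical command node
    "session-1",                     # hypothetical sessionid
    "The encoder could not parse the file.",
)
print(ET.tostring(stanza, encoding="unicode"))
```

The `<error>` child is left empty here because its content is service specific; a real service would place its own namespaced error document inside it.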

As long as the procedure has not finished (!), the service MAY provide additional status information within the IO Data data container (IO Data transaction type='status').

Formalizing machine-to-machine commands using the namespace defined herein contributes to the usability of XMPP for complex grid-computing projects in three ways: such commands become detectable and usable on the fly, the requester does not need to know the exact interface on the service side in advance, and both asynchronous and synchronous execution are supported.

For example, an IDE could support the development of such projects by generating code interfaces (client stubs) for machine-to-machine capable XMPP services, discovering and requesting all required information on the fly.

The requester can query for disco information on the command (Ad-Hoc Command) node to find out if it supports IO Data based commands.


To indicate support for IO Data, the disco information result MUST include <feature var='urn:xmpp:tmp:io-data'/>. The node can still provide <feature var='jabber:x:data'/> if Data Forms is supported as well.
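A requester could check for this feature as sketched below. The parsing uses Python's standard `xml.etree.ElementTree`; the function name `supports_io_data` and the sample node name are assumptions made for this example, while the disco#info namespace and `<feature var='...'/>` syntax are those of Service Discovery (XEP-0030).

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
IO_DATA_FEATURE = "urn:xmpp:tmp:io-data"

# Hypothetical disco#info result payload for a command node.
SAMPLE_QUERY = """
<query xmlns='http://jabber.org/protocol/disco#info' node='example-node'>
  <feature var='http://jabber.org/protocol/commands'/>
  <feature var='urn:xmpp:tmp:io-data'/>
</query>
"""


def supports_io_data(query_xml):
    """Return True if the disco#info <query/> advertises the IO Data feature."""
    query = ET.fromstring(query_xml)
    features = {
        feature.get("var")
        for feature in query.findall(f"{{{DISCO_INFO_NS}}}feature")
    }
    return IO_DATA_FEATURE in features


print(supports_io_data(SAMPLE_QUERY))
```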

The 'in' and 'out' elements may each have any valid XML-encoded elements as children. From an XML document point of view, <in/> and <out/> may be seen as root elements. It is therefore necessary to "discover" the XML Schemata of the "dynamic children" of <in/> and <out/> (the IO Schemata). This way a requester can marshal an API for the input and output of a certain service.

Besides the 'in' and 'out' elements, an 'error' element is optionally allowed and would be discovered in exactly the same way. It is not included in the example to keep it simple.

The XML Schemata request is done by setting the type of the IO Data element to 'io-schemata-get'.
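A schemata request could be assembled as in the sketch below. The IQ type 'set' and the `<command>` element with action='execute' follow Ad-Hoc Commands (XEP-0050); the wrapper element name `iodata`, the function name, and the addresses and identifiers are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

CMD_NS = "http://jabber.org/protocol/commands"
IO_NS = "urn:xmpp:tmp:io-data"


def build_schemata_request(to, node, request_id):
    """Build an <iq/> that asks a command node for its IO Schemata."""
    iq = ET.Element("iq", {"type": "set", "to": to, "id": request_id})
    command = ET.SubElement(
        iq,
        f"{{{CMD_NS}}}command",
        {"node": node, "action": "execute"},
    )
    # An empty IO Data element; only its type attribute carries meaning.
    ET.SubElement(command, f"{{{IO_NS}}}iodata", {"type": "io-schemata-get"})
    return iq


request = build_schemata_request(
    "service.example.org",  # hypothetical service address
    "example-node",         # hypothetical command node
    "schemata-1",           # hypothetical stanza id
)
print(ET.tostring(request, encoding="unicode"))
```

Because the schemata are requested separately from the data transaction, a client can cache the result and skip this step on later invocations.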

This service returns 3D atomic coordinates for the input structure. The input and output is encoded using the Chemical Markup Language (CML).

This example service requires the content of <in/> and <out/> to be Chemical Markup Language (CML; see http://www.xml-cml.org/): it requires input in the namespace 'http://www.xml-cml.org/schema' and also defines the returned output to be Chemical Markup Language.

To keep the example simple the children of the 'in' and 'out' elements just contain strings (the protein name and protein sequence). However in real use cases it is likely that the children of 'in' and 'out' contain very complex XML documents with many different valid elements, namespaces, or values.

The requester transmits the input to the service (responder) by setting the type of the IO Data element to 'input'.

CAB08284

The service transmits the output to the requester by setting the type of the IO Data element to 'output'.

mrkhpqsatk hlfvsggvas slgkgltass lgqlltargl hvtmqkldpy lnvdpgtmnp fqhgevfvte dgaetdldvg hyerfldrdl sgsanvttgq vystviaker rgeylgdtvq viphitdeik qrimamaqpd ggdnrpdvvi teiggtvgdi esqpfleaar qvrhdlgren vfflhvslvp hlapsgelkt kptqhsvaal rsigitpdal ilrcdrdvpe slknkialmc dvdidgvist pdapsiydip kvlhreelda fvvrrlnlpf rdvdwtewdd llrrvhephg tvrialvgky vdfsdaylsv sealhaggfk hyakvevvwv asddcetatg aaavladvhg vlipggfgir giegkigair yararglpvl glclglqciv ieatrsvglv qansaefepa tpdpvistma dqkeivagea dfggtmrlga ypavlqpasi vaqaygttqv serhrhryev nnayrdwiae sglrisgtsp dgylvefvey panmhpfvvg tqahpelksr ptrphplfva fvgaaidyks aellpveipa vpeisehlpn ssnqhrdgve rsfpapaarg

In this example the Ad-Hoc Command is a time-consuming and cost-intensive computation service. To keep the example simple, the computation is a WAV to MP3 encoder. The input and output elements of this example make use of Bits of Binary (XEP-0231).

[ ... base64-encoded-audio ... ]

The service notifies the requester that the job has been accepted: it responds with status='executing' and an <actions> element containing the <next> element.

WAV to MP3 encoding has been started. You may stay up to date using the next-action.

The requester MAY stay up-to-date by proceeding with action='next' combined with the IO Data transaction type='getStatus'.


The service returns the status of the procedure. The "still calculating"-status is indicated with the <actions> element that contains the <next> element only. The "calculation finished"-status is indicated with the <actions> element that contains the <next> and <complete> elements.

Optionally the result MAY contain additional status information within the IO Data element with IO Data transaction type='status', although this is not shown here to keep the example simple.

When the procedure is complete, the service notifies the invoker with a message stanza containing an Ad-Hoc Command element with status='executing' and an <actions> element that contains the <next> and <complete> elements. The <complete> element indicates that the calculation has finished.

WAV to MP3 encoding finished. You may request the output now.

After that the requester can request the output with the Ad-Hoc Command action='complete'.


The service returns the MP3 within the IO Data element. The Ad-Hoc Command is now completed (status='completed').

[ ... base64-encoded-audio ... ]

Alternatively, the requester can request the output with the Ad-Hoc Command action='next' combined with the IO Data transaction type='getOutput'. This keeps the Ad-Hoc Command session alive, so the session must be deleted afterwards. This design allows the requester to recover from a network failure during transmission of the result: because the session is left open after the first request, the computation result can be requested a second time.

The service returns the MP3 within the IO Data element. The status of the Ad-Hoc Command remains active (status='executing').

[ ... base64-encoded-audio ... ]

The requester MUST subsequently delete the remote procedure with the Ad-Hoc Command action='cancel'.


The remote procedure is deleted.


In case of an error the service notifies the invoker with a message stanza containing an Ad-Hoc Command element with status='executing' and an <actions> element that contains the <next> element. In addition, it MUST contain a <note> element with type='error' to indicate the error.

The error notification MAY contain additional error information within the IO Data element with IO Data transaction type='error'.

#593 - The encoder could not parse the file. 593 The encoder could not parse the file.

In case of an error the service would respond to a status request (Ad-Hoc Command action='next' combined with the IO Data transaction type='getStatus') in a very similar way, except that an <iq> rather than a <message> would be used.

An asynchronous remote procedure may be canceled (deleted) by the invoker at any time.


The remote procedure is deleted.


Error codes on the Ad-Hoc Command abstraction layer are inherited from Ad-Hoc Commands.

Application-specific errors associated with a remote procedure call realized with IO Data in combination with Ad-Hoc Commands are described in Section 3 (Implementation Notes).

Internationalization of messages sent by the server is covered by setting the @xml:lang attribute of the <iq> element. Services should reply in the same language in which the client asked the question. That is, if the client specifies a locale using the @xml:lang attribute on the <iq> element, then the server should reply in the same locale, and localize messages given in <desc>, <node>@info and <query><item>@name.

To follow.

This document requires no interaction with the Internet Assigned Numbers Authority (IANA).

Until this specification advances to a status of Draft, its associated namespace shall be "urn:xmpp:tmp:io-data"; upon advancement of this specification, the XMPP Registrar shall issue a permanent namespace in accordance with the process defined in Section 4 of XMPP Registrar Function (XEP-0053).

The Bioclipse Project