Accessing an RDBMS in a generic fashion is a complex and difficult
task. Consequently, this document does not attempt to XMLize a generic
database API or query language. Instead, it has two aims: to provide a
simple mechanism for a JID to read/write data that it has access to,
and to specify a model for those schemas to use in XML.

Although designed for use with an RDBMS, this document is not
restricted to such uses. It may be used with any data storage
system that can be broken down to a simple table, column/row
format, for example comma-delimited files.

To understand the following sections of this document, the reader
must be aware of the following.

The current namespace of http://openaether.org/projects/jabber_database.html
will be used until this becomes a JEP. Once officially accepted as
a JEP and approved as final by the council, it will become
http://www.xmpp.org/extensions/xep-0043.html.

There is a limited subset of data types available. The type attribute
of the <col> element in the DTD at the end of this document enumerates
them: bit, tinyint, integer, utinyint, uinteger, float, numeric, date,
datetime, timestamp, time, char, vchar, text and blob.

All SQL/RDBMS units will be scoped in the XML hierarchy:

<database>
  <table>
    <col/>
  </table>
</database>

All examples will assume the existence of the following RDBMS setup: a
database named 'testdb' with tables created with the following SQL
script:

create table tbl_one
(
  a_int int,
  a_float float,
  a_char char(10)
)
create table tbl_two
(
  a_date datetime,
  a_numeric numeric(9,3)
)

The simplest schema request discovers what tables/procedures
exist on the database testdb and what permissions are available
to the user.

All schema requests will respond within the scope that
was asked for. The response is scoped to only the 'children' of the
request: since the request was for the testdb database, only the tables
within that database are returned in the result. This prevents
unnecessary data, such as excessively large packets from large schemas,
from flooding the network.

In the examples used throughout this document, the response indicates
that the user has both read and write permissions on the table
'tbl_one' and only read permissions on the table 'tbl_two'.
Consequently, the user may only perform get requests on 'tbl_two'.

The schema response for tbl_one is quite intuitive. Three
columns exist: one called a_int of type int (integer), another called
a_float of type float, and a third called a_char of type char
with a size of ten characters.

Manipulation of data (select, insert, update, delete) will
definitely not be elegant or easy. SQL allows for some fairly
complex queries on any fully functional RDBMS. Consequently,
the data manipulation will be relatively limited, since it is
not a goal to translate SQL into XML.

To indicate a select-like query, specify an <iq> of
type get. The table that the query is to be performed against
must be specified. The columns that are to be returned in
the result set must be scoped within the respective table.
Any attribute on the <col> element besides name will be
ignored; e.g., it is neither required nor recommended to specify
the data types or the sizes while performing a get.

It is also possible to limit the number of rows
returned in the result set by specifying a value for the limit
attribute of the <table> element. For example, a limit of two means
that at most two rows will be returned in the result set.

The result set which is returned will contain all the rows
that met the criteria of the select. There is no schema
information beyond the column names included in the result set.
Each 'row' in the result set is scoped within the corresponding
<table> element. This allows for queries on multiple
tables to be used in one <iq> packet.

It would be impractical to request the entire contents of the
table every time you needed one row or a subset of the data. You
can constrain the result set by specifying a where clause.

The attributes used only in a <col> element within a
<where> element are op (for operator) and conj (for
conjunction). The op attribute is used for comparison operators such
as <, >, =, <>, <=, >=.
The conj (conjunction) attribute is used to combine constraints in the where clause.
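As a rough illustration of a constrained select, the sketch below asks for two columns of tbl_one and combines two constraints. The op and conj values (eq, and) are taken from the DTD; the id value, the literal values, and the choice to put conj on the second constraint are assumptions made for this example only.

```xml
<!-- Hypothetical request: id and constraint values are illustrative -->
<iq id="003" type="get" to="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <table name="tbl_one">
      <!-- columns to return -->
      <col name="a_int"/>
      <col name="a_float"/>
      <!-- constraints; conj placement is an assumption -->
      <where>
        <col name="a_int" op="eq">1234</col>
        <col name="a_char" op="eq" conj="and">onetwothre</col>
      </where>
    </table>
  </database>
</iq>
```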
Result
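A response to a constrained select on tbl_one might look like the following sketch. It assumes that each returned row is carried in its own <table> element, as the description of result sets suggests, and that two rows matched; the id and the data values are invented for illustration.

```xml
<!-- Hypothetical response: one <table> element per returned row -->
<iq id="003" type="result" from="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <table name="tbl_one">
      <col name="a_int">1234</col>
      <col name="a_float">1.5</col>
    </table>
    <table name="tbl_one">
      <col name="a_int">1234</col>
      <col name="a_float">2.25</col>
    </table>
  </database>
</iq>
```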
Inserting or altering the stored data in any way requires setting the type attribute to a value of set. This indicates that the user wants to perform an insert or an update. The differentiating factor between an insert and an update operation is whether a <where> element is used. If there is no <where> element, then it must be interpreted as an insert. If a <where> element does exist, then it must be interpreted as an update.
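A minimal sketch of an insert, assuming the testdb setup used by the examples: the <iq> has type set, each <col> carries the value to store, and no <where> element is present. The id, the values, and the date format are assumptions, not part of the protocol text.

```xml
<!-- Hypothetical insert into two tables in one packet -->
<iq id="004" type="set" to="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <table name="tbl_one">
      <col name="a_int">5678</col>
      <col name="a_float">3.14</col>
      <col name="a_char">hello</col>
    </table>
    <table name="tbl_two">
      <!-- date format is an assumption -->
      <col name="a_date">2002-08-14 12:00:00</col>
      <col name="a_numeric">123.456</col>
    </table>
  </database>
</iq>
```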
Result
If there is no result set for the query, as in an update, insert, delete, then the response must indicate success or failure within the <table> element scope. An empty <table> element indicates success, and a <table> element containing an <error> element indicates a failure.
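The success and failure cases can be sketched as follows, assuming an insert was attempted on both tbl_one (read/write) and tbl_two (read only). The error code 380 comes from the error table in this document; the id, the error text casing, and the use of type result on the enclosing <iq> are assumptions.

```xml
<!-- Hypothetical response to an insert into two tables -->
<iq id="004" type="result" from="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <!-- empty <table>: the operation succeeded -->
    <table name="tbl_one"/>
    <!-- <table> containing <error>: the operation failed -->
    <table name="tbl_two">
      <error code="380">Permission Denied on Table</error>
    </table>
  </database>
</iq>
```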
The insert into tbl_one succeeded, since the response has an empty <table> element. However, the insert into tbl_two failed with a permission denied error, which is indicated by a non-empty <table> element.
As stated previously, if the type attribute has a value of set and a <where> element exists, then it must be interpreted as an update.
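An update could therefore be sketched like this: the <col> elements directly under <table> carry the new values, and the <where> element selects the rows to change. The id and the values are illustrative assumptions.

```xml
<!-- Hypothetical update: set a_float where a_int equals 5678 -->
<iq id="005" type="set" to="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <table name="tbl_one">
      <col name="a_float">2.71</col>
      <where>
        <col name="a_int" op="eq">5678</col>
      </where>
    </table>
  </database>
</iq>
```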
Result
Again, if there is no result set returned by the query, then success or failure must be indicated.
If the type attribute has a value of set and there are no <col> elements scoped within the <table> element, then the query must be interpreted as a delete.
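A delete might look like the sketch below. It assumes that <col> elements inside <where> do not count as columns scoped directly within <table>, so a <table> holding only a <where> element is read as a delete; the id and the value are invented.

```xml
<!-- Hypothetical delete: no <col> directly under <table> -->
<iq id="006" type="set" to="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <table name="tbl_one">
      <where>
        <col name="a_int" op="eq">5678</col>
      </where>
    </table>
  </database>
</iq>
```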
Result
Again, if a result set is not generated by a query, then success or failure must be indicated by the <table> element.
Procedures, or stored procedures
The <proc> element will be used to indicate a procedure. It has similar characteristics to the <table> element. The core differences are that the <col> elements have permissions and a <result> element can be used to indicate the value returned by the procedure.
The permission attribute on a <col> element is used to indicate whether the parameter is in (read), out (write) or in/out (both).
The only result set acceptable from a procedure is that of the parameters, i.e. the <col> elements. If the procedure produces a result set outside of the parameters, it should be ignored.
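As a hedged sketch of a procedure call: the testdb setup defines no procedures, so the procedure name sp_example and its parameters are purely hypothetical, as is the choice of type set for the <iq>. Only the permission values and the <result> element come from the description above.

```xml
<!-- Hypothetical procedure call; sp_example does not exist in testdb -->
<iq id="007" type="set" to="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <proc name="sp_example">
      <!-- in parameter -->
      <col name="a_in" permission="read">10</col>
      <!-- out parameter, filled in by the response -->
      <col name="a_out" permission="write"/>
      <!-- in/out parameter -->
      <col name="a_inout" permission="both">1</col>
      <!-- placeholder for the procedure's return value -->
      <result/>
    </proc>
  </database>
</iq>
```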
The server must be able to let the client know when an error occurs, instead of just being silent.
Code | Message | Description |
---|---|---|
399 | Invalid Database Name | Returned when the client has requested information from a database which does not exist according to the component. |
398 | Invalid Table Name | Returned when the client has requested information from a table/procedure which does not exist according to the component. |
397 | Invalid Column Name | Returned when the client has requested information from a column which does not exist according to the component. |
380 | Permission Denied on Table | Returned when the requested action is not allowed for the user on the table |
401 | Access Denied | Returned when the user does not have permission to use the component. |
If the user requests an action on a table that they do not have permission to perform, the following should be returned:
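One plausible shape for this error, using code 380 from the table above, is sketched here. The id, the table involved, and the use of type error on the <iq> are assumptions.

```xml
<!-- Hypothetical permission error scoped to the offending table -->
<iq id="004" type="error" from="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <table name="tbl_two">
      <error code="380">Permission Denied on Table</error>
    </table>
  </database>
</iq>
```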
If the user is not allowed to access the component, the following should be returned:
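A sketch of the access-denied case, using code 401 from the table above. Placing the <error> element directly under <database> is permitted by the DTD, but the id and the overall layout are assumptions.

```xml
<!-- Hypothetical access-denied error for the whole component -->
<iq id="001" type="error" from="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <error code="401">Access Denied</error>
  </database>
</iq>
```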
There are requirements which can be provided by other Jabber components/namespaces, namely the jabber:iq:browse namespace in place of Version Negotiation.

Due to the inherent limitations of the above data retrieval mechanisms, more sophisticated querying techniques might be desired. The <query> element will extend the functionality
The abilities described in the Basics section are just that, basic. To provide more flexibility and allow for the full power of SQL without xmlifying everything, a <sql> element may be implemented to provide this feature.
The <sql> element must be scoped within the <database> element.
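A request using the <sql> element might be sketched as follows. The SQL statement, the id, and the choice of type get are assumptions; note that characters such as > must be escaped inside XML character data.

```xml
<!-- Hypothetical free-form SQL query scoped within <database> -->
<iq id="008" type="get" to="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <sql>select a_int, a_char from tbl_one where a_float &gt; 1.0</sql>
  </database>
</iq>
```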
Result
Since SQL is so flexible, the result set schema is not known until it is returned as a result of the query. Consequently, it must be sent as the first 'row' of the returned result. Each following row will be the actual data queried for.
If multiple tables are used within one SQL statement, then the name attribute within the <table> element cannot be accurately denoted with a single table name. The best way to deal with this situation is to simply use a unique identifier within the scope of the <database> element. This will allow for multiple <sql> results to be scoped within the same result.
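The result of an <sql> query could then be sketched like this: the first 'row' carries only the schema, the following rows carry data, and all rows share an identifier in the name attribute. The identifier sql_1, the id, and the data values are hypothetical.

```xml
<!-- Hypothetical result of an <sql> query -->
<iq id="008" type="result" from="db.host">
  <database
      name="testdb"
      xmlns="http://openaether.org/projects/jabber_database.html">
    <!-- first 'row': schema of the result set -->
    <table name="sql_1">
      <col name="a_int" type="integer"/>
      <col name="a_char" type="char" size="10"/>
    </table>
    <!-- following 'rows': the data itself -->
    <table name="sql_1">
      <col name="a_int">1234</col>
      <col name="a_char">onetwothre</col>
    </table>
  </database>
</iq>
```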
It is expected that this protocol will grow and be extended
to meet various demands. Therefore, version
negotiation
When the connection initiator (a client end-user or a server/transport) starts a session, it must first send the version number it expects to use; otherwise, behavior is undefined.
<iq id="000" type="get" to="db.host">
<database
xmlns="http://openaether.org/projects/jabber_database.html">
<version>0.1</version>
</database>
</iq>
Three responses are possible from the server.
<iq id="000" type="result" from="db.host">
<database
xmlns="http://openaether.org/projects/jabber_database.html">
<version>0.1</version>
</database>
</iq>
A type of 'result' indicates that the version request was
successful. If the client is satisfied with the version number,
it may continue with schema requests or other operations.
<iq id="000" type="error" from="db.host">
<database
xmlns="http://openaether.org/projects/jabber_database.html"/>
</iq>
A type of 'error' indicates a failure in conforming to the
desired version number. The server may optionally suggest an
alternative version.
<iq id="000" type="error" from="db.host">
<database
xmlns="http://openaether.org/projects/jabber_database.html">
<version>0.2</version>
</database>
</iq>
Thanks to Russell Davis (ukscone) for fine tuning the layout and wording of this jep. It would probably have been unreadable if it wasn't for him.
<!ELEMENT version (#PCDATA)>
<!ELEMENT error (#PCDATA)>
<!ELEMENT sql (#PCDATA)>
<!ELEMENT database (version | table | proc | sql | error)*>
<!ELEMENT table (col | where | error)*>
<!ELEMENT where (col+)>
<!ELEMENT col (#PCDATA)>
<!ELEMENT proc (col | result | error)*>
<!ELEMENT result (#PCDATA)>
<!ATTLIST error code CDATA #IMPLIED>
<!ATTLIST database name CDATA #IMPLIED>
<!ATTLIST table
name CDATA #IMPLIED
permission (read | write | both) #IMPLIED
limit CDATA #IMPLIED
>
<!ATTLIST proc name CDATA #IMPLIED>
<!ATTLIST col
name CDATA #IMPLIED
size CDATA #IMPLIED
op (eq | neq | lt | gt | let | get | null) #IMPLIED
conj (not | or | and ) #IMPLIED
permission (read | write | both) #IMPLIED
type (bit | tinyint | integer | utinyint | uinteger |
float | numeric | date | datetime | timestamp |
time | char | vchar | text | blob) #IMPLIED
>
Anyone care to do this?