Internet of Things - Sensor Data This specification provides the common framework for sensor data interchange over XMPP networks. 0323 Experimental Standards Track Standards Council XMPP Core XEP-0001 XEP-0030 NOT_YET_ASSIGNED Peter Waher peter.waher@clayster.com peter.waher@jabber.org http://se.linkedin.com/pub/peter-waher/1a/71b/a29/ 0.1 2013-04-16 psa

Initial published version approved by the XMPP Council.

0.0.5 2013-04-01 pwa

Added resource information of original called to their corresponding JIDs.

Changed the return type of a rejected message.

Made images inline.

Converted the glossary into a definition list.

0.0.4 2013-03-18 pwa

Added information about how to read sensors from large subsystems.

Added support for client/device provisioning tokens.

0.0.3 2013-03-11 pwa

Changed time point to timestamp everywhere.

Corrected some errors in the text.

Made the accepted response optional.

0.0.2 2013-03-09 pwa

Corrected some errors in XML examples.

English corrected.

Added errors elements to the rejected element.

Added cancel command with corresponding cancelled response.

0.0.1 2013-03-07 pwa

First draft.

This XEP provides the underlying architecture, basic operations and data structures for sensor data communication over XMPP networks. It includes a hardware abstraction model, removing any technical detail implemented in underlying technologies.

Note that these XEPs are designed for implementation in sensors, many of which have a very limited amount of memory (both RAM and ROM) or resources (processing power). Simplicity is therefore of utmost importance. Furthermore, sensor networks can become huge, easily reaching millions of devices in peer-to-peer networks.

Sensor networks contain many different architectures and use cases. For this reason, the sensor network standards have been divided into multiple XEPs according to the following table:

XEP Description
XEP-0000-ColorParameter Defines extensions for how color parameters can be handled, based on &xep0004;
XEP-0000-DynamicForms Defines extensions for how dynamic forms can be created, based on &xep0004;, &xep0122;, &xep0137; and &xep0141;.
exi Defines how EXI can be used in XMPP to achieve efficient compression of data. Although not specific to sensor networks, this XEP should be considered in all sensor network implementations where memory and packet size are an issue.
xep-0000-SN-BatteryPoweredSensors Defines how to handle the peculiarities of battery powered devices, and other devices intermittently available on the network.
xep-0000-SN-Concentrators Defines how to handle architectures containing concentrators or servers handling multiple sensors.
xep-0000-SN-Control Defines how to control actuators and other devices in sensor networks.
xep-0000-SN-Discovery Defines the peculiarities of sensor discovery in sensor networks. Apart from discovering sensors by JID, it also defines how to discover sensors based on location, etc.
xep-0000-SN-Events Defines how sensors send events, how event subscription, hysteresis levels, etc., are configured.
xep-0000-SN-Interoperability Defines guidelines for how to achieve interoperability in sensor networks, publishing interoperability interfaces for different types of devices.
xep-0000-SN-Multicast Defines how sensor data can be multicast in efficient ways.
sensor-network-provisioning Defines how provisioning, the management of access privileges, etc., can be efficiently and easily implemented.
xep-0000-SN-PubSub Defines how efficient publication of sensor data can be made in sensor networks.
sensor-data This specification. Provides the underlying architecture, basic operations and data structures for sensor data communication over XMPP networks. It includes a hardware abstraction model, removing any technical detail implemented in underlying technologies. This XEP is used by all other sensor network XEPs.

The following table lists common terms and corresponding descriptions.

Actuator
Device containing at least one configurable property or output that can and should be controlled by some other entity or device.
Computed Value
A value that is computed instead of measured.
Concentrator
Device managing a set of devices which it publishes on the XMPP network.
Field
One item of sensor data. Contains information about: Node, Field Name, Value, Precision, Unit, Value Type, Status, Timestamp, Localization information, etc. Fields should be unique within the triple (Node ID, Field Name, Timestamp).
Field Name
Name of a field of sensor data. Examples: Energy, Volume, Flow, Power, etc.
Field Type
What type of value the field represents. Examples: Momentary Value, Status Value, Identification Value, Calculated Value, Peak Value, Historical Value, etc.
Historical Value
A value stored in memory from a previous timestamp.
Identification Value
A value that can be used for identification. (Serial numbers, meter IDs, locations, names, etc.)
Localization information
Optional information for a field, allowing the sensor to control how the information should be presented to human viewers.
Meter
A device possibly containing multiple sensors, used in metering applications. Examples: Electricity meter, Water Meter, Heat Meter, Cooling Meter, etc.
Momentary Value
A momentary value represents a value measured at the time of the read-out.
Node
Graphs contain nodes and edges between nodes. In Sensor Networks, sensors, actuators, meters, devices, gateways, etc., are often depicted as nodes, and links between sensors (friendships) are depicted as edges. In abstract terms, it's easier to talk about a Node than to list the different types of nodes possible (sensors, actuators, meters, devices, gateways, etc.). Each Node has a Node ID.
Node ID
An ID uniquely identifying a node within its corresponding context. If a globally unique ID is desired, an architecture using a universally accepted ID scheme should be used.
Parameter
Readable and/or writable property on a node/device. The XEP xep-0000-SN-Concentrators deals with reading and writing parameters on nodes/devices. Fields are not parameters, and parameters are not fields.
Peak Value
A maximum or minimum value during a given period.
Precision
In physics, precision determines the number of digits of precision. In sensor networks however, this definition is not easily applicable. Instead, precision determines, for example, the number of decimals of precision, or power of precision. Example: 123.200 MWh contains 3 decimals of precision. All entities parsing and delivering field information in sensor networks should always retain the number of decimals in a message.
Sensor
Device measuring at least one digital value (0 or 1) or analog value (value with precision and physical unit). Examples: Temperature sensor, pressure sensor, etc. Sensor values are reported as fields during read-out. Each sensor has a unique Node ID.
SN
Sensor Network. A network consisting of, but not limited to, sensors, where transport and use of sensor data is of primary concern. A sensor network may contain actuators, network applications, monitors, services, etc.
Status Value
A value displaying status information about something.
Timestamp
Timestamp of value, when the value was sampled or recorded.
Token
A client, device or user can get a token from a provisioning server. These tokens can be included in requests to other entities in the network, so these entities can validate access rights with the provisioning server.
Unit
Physical unit of value. Example: MWh, l/s, etc.
Value
A field value.
Value Status
Status of field value. Contains important status information for Quality of Service purposes. Examples: Ok, Error, Warning, Time Shifted, Missing, Signed, etc.
Value Type
Can be numeric, string, boolean, Date & Time, Time Span or Enumeration.
WSN
Wireless Sensor Network, a sensor network including wireless devices.
XMPP Client
Application connected to an XMPP network, having a JID. Note that sensors, as well as applications requesting sensor data can be XMPP clients.

The most common use case for a sensor network application is meter read-out. It's performed using a request and response mechanism, as is shown in the following diagram.

The read-out request is started by the client sending a req request to the device. Here, the client selects a sequence number seqnr. It should be unique among requests made by the client. The device will use this sequence number in all messages sent back to the client.

The request also contains a set of field types that very roughly determine what the client wants to read. What the device will actually return is determined by many other factors, such as make and model of the device, any provisioning rules provided, etc. This parameter just gives a hint on what kind of data is desired. It is implicit in the request by the context what kind of data is requested. Examples of field types are: Momentary values, peak values, historical values, computed values, status values, identification values, etc.

If reading historical values, the client can also specify an optional time range using the from and to parameter values, giving the device a hint on how much data to return.

If the client wants the read-out to be performed at a given point in time, the client can define this using the optional parameter when.

There's an optional parameter ids that the client can provide, listing a set of Node IDs. If omitted, the request includes all sensors or devices managed by the current JID. But, if the JID is controlled by a system, device or concentrator managing various devices, the ids parameter restricts the read-out to specific individuals.

Note: The device is not required to follow the hints given by the client. These are suggestions the client can use to minimize its effort to perform the read-out. The client MUST make sure the response is filtered according to original requirements by the client after the read-out response has been received.

If the device accepts the client request, it sends an accepted response back to the client. The device also has to determine if the read-out is commenced directly, or if it is to be queued for later processing. Note that the request can be queued for several reasons. The device can be busy, and queues it until it is ready to process the request. It can also queue the request if the client has requested it to be executed at a given time. If the request is queued, the device informs the client of this using the queued attribute. Note however, that the device will process the request when it can. There's no guarantee that the device will be able to process the request exactly when the client requests it.

Note: The accepted message can be omitted if the device already has the response and is ready to send it. If the client receives field data or a done message before receiving an accepted message, the client can assume the device accepted the request and omitted sending an accepted element.

If the request was queued, the device will send a message informing the client when the read-out is begun. This is done using a started message, using the same seqnr used in the original request.

Note: Sending a started element should be omitted by the device if the request is not queued on the device. If the queued attribute is omitted in the response, or has the value false, the client must not assume the device will send a started element.

During the read-out, the device sends partial results back to the client using the same seqnr as used in the request, using a fields message. These messages will contain a sequence of fields read out of the device. The client is required to filter this list according to original specifications, as the device is not required to do this filtering for the client.
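Since the device is not required to honour the hints in the request, the filtering obligation rests with the client. The following is a minimal sketch of such a client-side filter, not a definitive implementation. The `Field` class, the `filter_fields` function and the node names in the usage note are hypothetical and are not defined by this specification; only the filter criteria (field types, field names, Node IDs and time range) come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Field:
    """Hypothetical in-memory form of one reported field."""
    node_id: str
    name: str
    timestamp: datetime
    field_types: frozenset
    value: str


def filter_fields(fields, types=None, names=None, node_ids=None,
                  start=None, end=None):
    """Keep only the fields matching what the client originally asked for.

    A criterion set to None means "no restriction" for that criterion.
    """
    kept = []
    for field in fields:
        # A field qualifies if it has at least one of the wanted field types.
        if types is not None and not (field.field_types & types):
            continue
        if names is not None and field.name not in names:
            continue
        if node_ids is not None and field.node_id not in node_ids:
            continue
        # The time range is inclusive at both ends (an assumption).
        if start is not None and field.timestamp < start:
            continue
        if end is not None and field.timestamp > end:
            continue
        kept.append(field)
    return kept
```

A client that asked for momentary values of a hypothetical node `Device01` would call `filter_fields(received, types={"momentary"}, node_ids={"Device01"})` on every fields message it receives.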

When the read-out is complete, the device will send a done message to the client with the same seqnr as in the original request. Since the part of the device sending messages might not know, at the time of sending, whether more messages will follow, the device can send this message separately, as is shown in the diagram. If, however, the device knows that a message containing fields is the last one, it can set a done attribute in that message and skip the separate done message.

Note: There is no guarantee that the device will send a corresponding started and fields element, even though the request was accepted. The device might lose power during the process and forget the request. The client should always be aware that devices may not respond in time, and take appropriate action accordingly (for instance, by implementing a retry mechanism).

If a failure occurs while performing the read-out, a failure message is sent, instead of a corresponding fields message, as is shown in the following diagram. Apart from notifying the client that a failure to perform the read-out, or part thereof, has occurred, it also provides a list of errors that the device encountered while trying. Note that multiple fields and failure messages can be sent back to the client during the read-out.
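The message flow described above can be followed on the client side with a small state tracker. The sketch below is illustrative only. The class name `ReadoutTracker`, the state names and the method signature are hypothetical; the message names (accepted, started, fields, failure, done, rejected, cancelled) and the queued and done attributes are taken from the text. Honouring a done flag on a failure message is an assumption of this sketch.

```python
class ReadoutTracker:
    """Client-side state for one read-out, identified by its seqnr."""

    FINAL_STATES = ("done", "rejected", "cancelled")

    def __init__(self, seqnr):
        self.seqnr = seqnr
        self.state = "requested"
        self.fields = []   # fields collected so far
        self.errors = []   # errors reported in failure or rejected messages

    def on_message(self, name, seqnr, queued=False, done=False,
                   fields=(), errors=()):
        """Process one incoming message and return the resulting state."""
        if seqnr != self.seqnr or self.state in self.FINAL_STATES:
            return self.state  # belongs to another read-out, or already over
        if name == "accepted":
            self.state = "queued" if queued else "running"
        elif name == "started":
            self.state = "running"
        elif name == "fields":
            # Field data before an accepted message implies acceptance.
            self.fields.extend(fields)
            self.state = "done" if done else "running"
        elif name == "failure":
            self.errors.extend(errors)
            self.state = "done" if done else "running"  # assumption, see lead-in
        elif name == "done":
            self.state = "done"
        elif name == "rejected":
            self.errors.extend(errors)
            self.state = "rejected"
        elif name == "cancelled":
            self.state = "cancelled"
        return self.state
```

Because a device may lose power and forget a request, a real client would combine such a tracker with a timeout and a retry policy; neither is shown here.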

The device can also reject a read-out request. Reasons for rejecting a request may be missing privileges defined by provisioning rules, etc. It's not part of this XEP to define such rules. A separate XEP (sensor-network-provisioning) defines an architecture for how such provisioning can be easily implemented.

A rejection response is shown in the following diagram.

If a read-out has been queued, the client can cancel the queued read-out request by sending a cancel command to the device. If a read-out has begun and the client sends a cancel command to the device, the device can choose whether the read-out should be cancelled or completed.

Note: Remember that the seqnr value used in this command is unique only to the client making the request. The device can receive requests from multiple clients, and must make sure it distinguishes between seqnr values from different clients. Different clients are assumed to have different values in the corresponding from attributes.
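One way for a device to keep sequence numbers from different clients apart is to key every pending read-out on the pair (from, seqnr). The sketch below illustrates this under stated assumptions: the class `PendingReadouts` and its methods are hypothetical, and the JIDs in the usage note are placeholders.

```python
class PendingReadouts:
    """Device-side registry of queued read-outs, keyed by (from JID, seqnr)."""

    def __init__(self):
        self._pending = {}

    def __len__(self):
        return len(self._pending)

    def enqueue(self, from_jid, seqnr, request):
        """Remember a queued request; the same seqnr may recur per client."""
        key = (from_jid, seqnr)
        if key in self._pending:
            raise ValueError("seqnr already in use by this client")
        self._pending[key] = request

    def get(self, from_jid, seqnr):
        """Return the queued request, or None if there is none."""
        return self._pending.get((from_jid, seqnr))

    def cancel(self, from_jid, seqnr):
        """Remove a queued request; return True if something was cancelled."""
        key = (from_jid, seqnr)
        if key not in self._pending:
            return False
        del self._pending[key]
        return True
```

With this keying, a cancel command carrying seqnr 1 from `client1@example.org/amr` leaves a request with seqnr 1 from `client2@example.org/amr` untouched.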

The client that wishes to receive momentary values from the sensor initiates the request using the req request sent to the device.

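As an illustration, the payload of such a request could be assembled as follows. This is a minimal sketch rather than a normative encoding: the function `build_req` is hypothetical, and it assumes that field types are expressed as attributes set to true and that the optional from, to and when parameters are carried as attributes of the same name. The element name req, the seqnr attribute and the namespace urn:xmpp:sn are taken from this document.

```python
import xml.etree.ElementTree as ET

SN_NS = "urn:xmpp:sn"  # feature namespace named in this document
ET.register_namespace("", SN_NS)


def build_req(seqnr, field_types, time_from=None, time_to=None, when=None):
    """Build the req payload of a read-out request (sketch, see lead-in)."""
    req = ET.Element("{%s}req" % SN_NS, {"seqnr": str(seqnr)})
    # Each requested field type becomes an attribute set to "true".
    for field_type in field_types:
        req.set(field_type, "true")
    # Optional hints; the device is free to ignore them.
    for name, value in (("from", time_from), ("to", time_to), ("when", when)):
        if value is not None:
            req.set(name, value)
    return req


# A client asking for momentary values, using sequence number 1.
example = build_req(1, ["momentary"])
print(ET.tostring(example, encoding="unicode"))
```

The resulting element would be placed inside the request stanza addressed to the device; the stanza wrapper is left out here because its exact attributes are not recoverable from the text above.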

When the device has received and accepted the request, it responds as follows:


When read-out is complete, the response is sent as follows:


If instead a read-out could not be performed, the communication sequence might look as follows:

Timeout.

If for some reason, the device rejects the read-out request, the communication sequence might look as follows:

Access denied.

Note that the type of the returning IQ stanza is error.

The following example shows a communication sequence when a client reads out all available information from a sensor at a given point in time:


The following example shows how a client reads a subset of multiple sensors behind a device with a single JID.


The req element can take field sub-elements, with which the client can specify which fields it is interested in. If none are provided, the device is assumed to return all matching fields, regardless of field name. The field elements in the request can thus be used as a hint about which fields should be returned.

Note: the device is not required to adhere to the field limits expressed by these field elements. They are considered a hint the device can use to limit bandwidth.

The following example shows how a client can read specific fields in a device.


The following example shows how the client cancels a scheduled read-out:


If an entity supports the protocol specified herein, it MUST advertise that fact by returning a feature of "urn:xmpp:sn" in response to &xep0030; information requests.


In order for an application to determine whether an entity supports this protocol, where possible it SHOULD use the dynamic, presence-based profile of service discovery defined in &xep0115;. However, if an application has not received entity capabilities information from an entity, it SHOULD use explicit service discovery instead.
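When explicit service discovery is used, the check amounts to looking for the feature urn:xmpp:sn in the service discovery information result. The sketch below shows one way to do this with the Python standard library. The function name `supports_sensor_data` and the JIDs in the sample stanza are hypothetical; the disco#info namespace is the one defined by the service discovery specification.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
SN_FEATURE = "urn:xmpp:sn"


def supports_sensor_data(iq_xml):
    """Return True if a disco#info result advertises the sensor data feature."""
    iq = ET.fromstring(iq_xml)
    query = iq.find("{%s}query" % DISCO_INFO_NS)
    if query is None:
        return False
    features = query.findall("{%s}feature" % DISCO_INFO_NS)
    return any(feature.get("var") == SN_FEATURE for feature in features)


# Sample result with placeholder addresses.
SAMPLE = (
    "<iq type='result' from='device@example.org' "
    "to='client@example.org/amr' id='disco1'>"
    "<query xmlns='http://jabber.org/protocol/disco#info'>"
    "<feature var='urn:xmpp:sn'/>"
    "</query></iq>"
)
```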

As can be seen, a conscious effort has been made not to shorten element and attribute names. This is to make sure the XML remains readable. Packet size is not deemed to be affected negatively by this for two reasons:

  • For sensors with limited memory, or where packet size is important, EXI (Efficient XML Interchange (EXI) Format <http://www.w3.org/TR/exi/>) is expected to be used. EXI compresses strings as normalized index values, making each string appear only once in the packet. Therefore, shortening strings does not affect packet size much. Furthermore, element and attribute names in known namespaces are encoded only by index in the schema, not by name.
  • If limited memory or packet size is not a consideration, readability and ease of implementation are preferred to short messages.

This protocol avoids the use of enumerations for data types such as units, field names, etc., and instead uses strings. The reasons for this are:

  • Enumerations would unnecessarily restrict the use of the protocol to field names and units listed in the protocol.
  • It would be very difficult to try to create a complete set of field names and units that would suit all applications.
  • Leaving these values as strings gives developers the liberty to use units as they desire.
  • If EXI is used for compression, the use of strings will only increase payload slightly, with only one copy of each distinct value used.
  • If EXI is not used, this does not affect packet size.

However, some things need to be taken into account:

  • Since free strings are used, XML validation cannot be used to ensure that correct names are used.
  • xep-0000-SN-Interoperability lists recommendations on how field names and units should be used in order to achieve maximum interoperability in SN.
  • Consumers of sensor data need to include unit conversion algorithms.

Since some applications require real-time feedback (or as real-time as possible), and read-out might in certain cases take a long time, the device has the option to send multiple fields messages during read-out. The client is responsible for collecting all such messages until either a done message is sent, or a corresponding done attribute is available in one of the messages received. Only the device knows how many (if any) messages are sent in response to a read-out request.

There are different types of values that can be reported from a device. The following table lists the various types:

Element Description
numeric Represents a numerical value. Apart from the number itself, a numerical value also carries an implicit precision (number of decimals) and an optional unit. All parties in the communication chain should retain the number of decimals used, since this contains information that is important in the interpretation of a value. For example, 10 °C is different from 10.0 °C, and very different from 10.00 °C. If a sensor delivers the value 10 °C you can assume it probably lies between 9.5 °C and 10.5 °C. But if a sensor delivers 10.00 °C, it is probably very exact (if calibrated correctly).
string Represents a string value. It contains an arbitrary string value.
boolean Represents a boolean value that can be either true or false.
dateTime Represents a date and optional time value. The value must be encoded using the xs:dateTime data type. This includes date, an optional time and optional time zone information. If time zone is not available, it is supposed to be undefined.
timeSpan Represents a time span value. This can be either a time of day value, if nonnegative and less than 24 hours, or a duration value.
enum Represents an enumeration value. What distinguishes this value from a string value is that, apart from the enumeration value itself (which is a string value), it also contains a data type, which consumers can use to interpret its value. This specification does not assume knowledge of any particular enumeration data types.
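The requirement to retain the number of decimals of a numeric value rules out a round trip through binary floating point, which would turn 10.00 into 10.0. A minimal sketch of one way to preserve precision, using the Python `decimal` module, is given below. The function names are hypothetical, the input is assumed to be a plain decimal number as transmitted, and the half-unit interval follows the 10 °C example in the table above.

```python
from decimal import Decimal


def decimals_of(value_text):
    """Number of decimals carried by a numeric value as transmitted."""
    exponent = Decimal(value_text).as_tuple().exponent
    return max(0, -exponent)


def implied_interval(value_text):
    """Interval suggested by the precision: half a unit of the last digit."""
    value = Decimal(value_text)
    exponent = value.as_tuple().exponent
    half_step = Decimal(1).scaleb(exponent) / 2
    return (value - half_step, value + half_step)


# Decimal keeps trailing zeros, so the value can be forwarded unchanged.
reading = Decimal("123.200")
forwarded = str(reading)
```

Here `forwarded` is still the text `123.200`, with all three decimals intact, whereas converting the same text to a binary float and back would drop the trailing zeros.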

There are different types of fields, apart from types of values a field can have. These types are conceptual types, similar to categories. They are not exclusive, and can be combined.

If requesting multiple field types in a request, the device must interpret this as a union of the corresponding field types and return at least all field values that contain at least one of the requested field types. Example: If requesting momentary values and historical values, devices must return both its momentary values and its historical values.

But, when a device reports a field having multiple field types, the client should interpret this as the intersection of the corresponding field types, i.e. the corresponding field has all corresponding field types. Example: A field marked as both a status value and as a historical value is in fact a historical status value.

The following table lists the different field types specified in this document:

Field Type Description
computed A value that is computed instead of measured.
historical* A value stored in memory from a previous timestamp. The suffix is used to determine period, as shown below.
historicalSecond A value stored at a second shift (milliseconds = 0).
historicalMinute A value stored at a minute shift (seconds=milliseconds=0). Are also second values.
historicalHour A value stored at an hour shift (minutes=seconds=milliseconds=0). Are also minute and second values.
historicalDay A value stored at a day shift (hours=minutes=seconds=milliseconds=0). Are also hour, minute and second values.
historicalWeek A value stored at a week shift (Monday, hours=minutes=seconds=milliseconds=0). Are also day, hour, minute and second values.
historicalMonth A value stored at a month shift (day=1, hours=minutes=seconds=milliseconds=0). Are also day, hour, minute and second values.
historicalQuarter A value stored at a quarter year shift (Month=Jan, Apr, Jul, Oct, day=1, hours=minutes=seconds=milliseconds=0). Are also month, day, hour, minute and second values.
historicalYear A value stored at a year shift (Month=Jan, day=1, hours=minutes=seconds=milliseconds=0). Are also quarter, month, day, hour, minute and second values.
historicalOther If the period of the historical value is not important in the request or by the device.
identity A value that can be used for identification. (Serial numbers, meter IDs, locations, names, addresses, etc.)
momentary A momentary value represents a value measured at the time of the read-out. Examples: Energy, Volume, Power, Flow, Temperature, Pressure, etc.
peak A maximum or minimum value during a given period. Examples: "Temperature, Max", "Temperature, Min", etc.
status A value displaying status information about something. Examples: Health, Battery life time, Runtime, Expected life time, Signal strength, Signal quality, etc.
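The relationship between a timestamp and the historical period types in the table above can be expressed as a small classification function. The sketch below is one possible reading of the table, not a normative algorithm; the function name `historical_periods` is hypothetical. It follows the table literally, so week membership does not imply month membership and vice versa, and it treats "milliseconds = 0" as a zero sub-second part.

```python
from datetime import datetime


def historical_periods(ts):
    """Return the historical period types a stored timestamp qualifies for."""
    periods = []
    if ts.microsecond != 0:
        return periods
    periods.append("historicalSecond")
    if ts.second != 0:
        return periods
    periods.append("historicalMinute")
    if ts.minute != 0:
        return periods
    periods.append("historicalHour")
    if ts.hour != 0:
        return periods
    periods.append("historicalDay")
    if ts.weekday() == 0:  # Monday
        periods.append("historicalWeek")
    if ts.day == 1:
        periods.append("historicalMonth")
        if ts.month in (1, 4, 7, 10):
            periods.append("historicalQuarter")
        if ts.month == 1:
            periods.append("historicalYear")
    return periods
```

A value stored at a timestamp that fits none of these shifts would be reported as historicalOther.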

There are two field type attributes that can be used in requests to simplify read-out:

Field Type Description
all Reads all types of fields. It is the same as explicitly setting all field type attributes to true.
historical If period of historical values is not important, this attribute can be set to include all types of historical values.
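The union rule for requested field types, together with the two shorthand attributes all and historical, can be sketched as follows. The function names `expand_requested` and `should_return` are hypothetical; the field type names are those listed in this document.

```python
HISTORICAL_TYPES = frozenset({
    "historicalSecond", "historicalMinute", "historicalHour",
    "historicalDay", "historicalWeek", "historicalMonth",
    "historicalQuarter", "historicalYear", "historicalOther",
})
ALL_TYPES = HISTORICAL_TYPES | frozenset({
    "computed", "identity", "momentary", "peak", "status",
})


def expand_requested(flags):
    """Expand the shorthand attributes into concrete field types."""
    requested = set()
    for flag in flags:
        if flag == "all":
            requested |= ALL_TYPES
        elif flag == "historical":
            requested |= HISTORICAL_TYPES
        else:
            requested.add(flag)
    return requested


def should_return(field_types, requested):
    """Union rule: a field qualifies if it has any requested field type."""
    return bool(set(field_types) & requested)
```

A field reported with both status and historicalDay is a historical status value; under the union rule it is returned for a request that asks for historical values only, and equally for one that asks for status values only.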

Note: The reason for including different time periods for historical values is that these periods are common in metering applications. However, the client is not restricted to these in any way. The client can always just ask for historical values, and do filtering as necessary to read out the interval desired.

Also, devices are not required to include logic to parse and figure out what historical values are actually desired by the client. If too complicated for the device to handle, it is free to report all historical values. However, the device should limit the historical values to any interval requested, and should try to limit itself to the field types requested. Information in the request element is seen as a set of hints that the device can use to optimize any communication required by the operation.

In metering applications where quality of service is important, a field must always be accompanied with a corresponding status flag. Devices should set these accordingly. If no status flag is set on a field, the client can assume automaticReadout is true.

Note that status flags are not exclusive. Many of them can logically be combined. Some also imply an order of importance. This should be kept in mind when trying to overwrite existing values with read values: An estimate should not overwrite a read-out, a read-out not a signed value, and a signed value not an invoiced value, etc.

Available status flags, in order of importance:

Status Flag Description
missing Value is missing
automaticEstimate An estimate of the value has been done automatically. Considered more reliable than a missing value (duh!).
manualEstimate The value has manually been estimated. Considered more reliable than an automatic estimate.
manualReadout Value has been manually read. Considered more reliable than a manual estimate.
automaticReadout Value has been automatically read. Considered more reliable than a manually read value.
timeOffset The time was offset more than allowed and corrected during the measurement period.
warning A warning was logged during the measurement period.
error An error was logged during the measurement period.
signed The value has been signed by an operator. Considered more reliable than an automatically read value. Note that the signed status flag can be used to overwrite existing values of higher importance. Example signed + invoiced can be considered more reliable than only invoiced, etc.
invoiced The value has been invoiced by an operator. Considered more reliable than a signed value.
endOfSeries The value has been marked as an end point in a series. This can be used for instance to mark the change of tenant in an apartment.
powerFailure The device recorded a power failure during the measurement period.
invoiceConfirmed The value has been invoiced by an operator and confirmed by the recipient. Considered more reliable than an invoiced value.
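The overwrite rule implied by the order of importance can be sketched as a comparison of reliability ranks. This is an illustration under several assumptions, not a definitive rule set: only the flags whose descriptions explicitly rank them against another flag take part in the ranking, a field without any of these flags is treated as automaticReadout, values of equal reliability may overwrite each other, and the signed flag is used as a tie-breaker as suggested by the signed + invoiced example. The function names are hypothetical.

```python
# Flags ranked against each other by their descriptions, least reliable first.
RELIABILITY_ORDER = [
    "missing", "automaticEstimate", "manualEstimate", "manualReadout",
    "automaticReadout", "signed", "invoiced", "invoiceConfirmed",
]
RANK = {flag: index for index, flag in enumerate(RELIABILITY_ORDER)}
DEFAULT_FLAG = "automaticReadout"  # assumed when no ranked flag is set


def reliability(flags):
    """Return a comparable (rank, signed) pair for a set of status flags."""
    ranks = [RANK[flag] for flag in flags if flag in RANK]
    best = max(ranks) if ranks else RANK[DEFAULT_FLAG]
    return (best, "signed" in flags)


def may_overwrite(existing_flags, new_flags):
    """True if a newly read value may replace the stored one."""
    return reliability(new_flags) >= reliability(existing_flags)
```

With these assumptions an estimate never replaces a read-out, and a read-out never replaces a signed or invoiced value.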

This document does not go into detail on how devices are ordered behind a JID. Some of the examples have assumed a single device lies behind a JID, others that multiple devices exist behind a JID. Also, no order or structure of devices has been assumed.

However, it is assumed that if a client requests a read-out of a supernode, this implies the read-out of all its subnodes. The client therefore cannot expect the read-out to be limited to the devices listed explicitly in a request, since descendant nodes of the selected nodes can also be included.

More information about how multiple devices behind a JID can be handled is given in the XEP xep-0000-SN-Concentrators.

All examples in this document are simplified examples in which a few devices containing a few fields have been read. In many cases, however, large subsystems with very many sensors containing many fields have to be read, as is documented in xep-0000-SN-Concentrators.html. In such cases, a node may have to be specified using two or perhaps even three IDs: a sourceId identifying the data source controlling the device, a possible cacheType narrowing down the search to a specific kind of node, and the common nodeId. For more information about this, see xep-0000-SN-Concentrators.html.

Note: For cases where the nodeId is sufficient to uniquely identify the node, it is sufficient to provide this attribute in the request. If there is ambiguity in the request, the receiver must treat the request as a request with a set of nodes, all with the corresponding nodeId as requested.
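The ambiguity rule can be illustrated with a lookup that treats omitted identifiers as wildcards. The sketch below is a minimal illustration; the `NodeRef` type, the `resolve` function and the identifier values in the usage note are hypothetical, while the attribute names nodeId, sourceId and cacheType come from the text.

```python
from typing import NamedTuple, Optional


class NodeRef(NamedTuple):
    """Identifies a node by nodeId and, optionally, sourceId and cacheType."""
    node_id: str
    source_id: Optional[str] = None
    cache_type: Optional[str] = None


def resolve(known_nodes, node_id, source_id=None, cache_type=None):
    """Return every known node matching the request.

    Omitted identifiers match anything, so an ambiguous request yields the
    whole set of nodes sharing the requested nodeId.
    """
    return [
        node for node in known_nodes
        if node.node_id == node_id
        and (source_id is None or node.source_id == source_id)
        and (cache_type is None or node.cache_type == cache_type)
    ]
```

If two data sources each contain a node with the hypothetical nodeId `Node1`, a request naming only `Node1` resolves to both nodes, while adding a sourceId narrows it to one.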

All timestamps and dateTime values use the XML data type xs:dateTime to specify values. These values include a date, an optional time and an optional time zone.

Note: If time zone is not available, it is supposed to be undefined. The client reading the sensor that reports fields without time zone information should assume the sensor has the same time zone as the client, if not explicitly configured otherwise on the client side.

If devices report time zone, this information should be propagated throughout the system. Otherwise, comparing timestamps from different time zones will be impossible.
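A client could apply these rules when parsing timestamps, as in the following sketch. The function name `parse_timestamp` and the `client_tz` parameter are hypothetical. The sketch relies on `datetime.fromisoformat`, which accepts the common lexical forms of xs:dateTime but not necessarily all of them, so a production parser may need to be stricter or more lenient.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(text, client_tz):
    """Parse a timestamp, assuming the client's zone if none is reported."""
    # Rewrite a trailing "Z" for interpreters whose parser does not accept it.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    ts = datetime.fromisoformat(text)
    if ts.tzinfo is None:
        # No zone reported: assume the sensor shares the client's time zone.
        ts = ts.replace(tzinfo=client_tz)
    return ts


# Example client zone: one hour ahead of UTC.
CLIENT_TZ = timezone(timedelta(hours=1))
local = parse_timestamp("2013-03-07T16:24:30", CLIENT_TZ)
utc = parse_timestamp("2013-03-07T16:24:30Z", CLIENT_TZ)
```

Once every timestamp carries an offset, values from different time zones compare correctly; in the example above `local` is one hour earlier than `utc`.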

This specification allows for localization of field names in meter data read-out. This is performed by assigning each localizable string a String ID which should be unique within a given Language Module. A Language Module can be any string, including URIs or namespace names. The XEP xep-0000-SN-Interoperability details how such localizations can be made in an interoperable way.

Note: Localization of strings is for human consumption only. Machines should use the unlocalized strings in program logic.

The following example shows how a device can report localized field information that can be presented to end users without systems being preprogrammed to recognize the device. Language modules can be aggregated by operators after installation, or installed as a pluggable module after the main installation, if localization is desired.


The above example defines a language module called Watchamacallit. In this language module it defines four strings, with IDs 1-4. A system might store these as follows, where the system replaces all %N% with a conceptual Nth parameter. (It's up to the system to define these strings, any syntax, and how to handle input and output.) In this example, we will assume %0% means any previous output, and %1% any seed value provided (see below).

ID String
1 Temperature
2 %0%, Min
3 %0%, Max
4 %0%, Mean

So, when the client reads the field name Temperature, Min, it knows that the field name is the composition of the string Temperature and the string %0%, Min, where %0% is replaced with the output of the previous step, in this case Temperature. Operators of the system can later localize these strings to different languages, so that values read from the device can be presented in a language different from the one used by the sensor.

Note: The XEP xep-0000-SN-Interoperability details how such localizations can be made in an interoperable way.

The stringIds attribute merits some further explanation. The value of this attribute must match the following regular expression:

^\d+([|]\w+([.]\w+)*([|][^,]*)?)?(,\d+([|]\w+([.]\w+)*([|][^,]*)?)?)*$

In other words, the value is a comma-separated list of items of the format: ID_1[|[Module_1][|Seed_1]][,...[,ID_n[|[Module_n][|Seed_n]]]]

Here, brackets [] mean that the contents inside are optional, and ID_i is an integer representing the string ID in a language module. Module_i is optional and allows for specifying a module for ID_i, if different from the module defined in the module attribute. Seed_i allows for seeding the generation of the localized string with a value. This might come in handy when generating strings like Input 5, where you don't want to create localized strings for every input there is.
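The syntax can be checked and split into steps as sketched below. Note that the regular expression as printed requires a non-empty module name after the first `|`, so it does not accept values with an empty module such as `3||A1`, even though the format description makes the module optional. The sketch therefore also defines a relaxed variant, which is an assumption of this example and not part of the specification. The function `parse_string_ids` is hypothetical.

```python
import re

# The expression exactly as printed in this document.
SPEC_PATTERN = re.compile(
    r"^\d+([|]\w+([.]\w+)*([|][^,]*)?)?(,\d+([|]\w+([.]\w+)*([|][^,]*)?)?)*$"
)

# Relaxed variant (assumption): the module name may be empty.
ITEM = r"\d+([|](\w+([.]\w+)*)?([|][^,]*)?)?"
RELAXED_PATTERN = re.compile(r"^%s(,%s)*$" % (ITEM, ITEM))


def parse_string_ids(value, default_module):
    """Split a stringIds value into (string ID, module, seed) steps."""
    if not RELAXED_PATTERN.match(value):
        raise ValueError("malformed stringIds value: %r" % value)
    steps = []
    for item in value.split(","):
        parts = item.split("|", 2)
        string_id = int(parts[0])
        # An omitted or empty module falls back to the module attribute.
        module = parts[1] if len(parts) > 1 and parts[1] else default_module
        seed = parts[2] if len(parts) > 2 else None
        steps.append((string_id, module, seed))
    return steps
```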

Why such a complicated syntax? The reason is the following: Most localized strings are simple numbers, without the need of specifying modules and seeds. This makes it very efficient to store the value as an attribute instead of having to create subelements for every localized field. Needing multiple steps or seeds in the generation of localized strings is the exception rather than the rule. Therefore, attributes are an efficient means to specify localization. However, in the general case, a single string ID is not sufficient and multiple steps are required, some seeded.

stringIds New Parts Result
1 1="Temperature" Temperature
1,2 2="%0%, Max" Temperature, Max
1,1|MathModule 1 in module "MathModule"="sum(%0%)" sum(Temperature)
3||A1 3="Input %1%" Input A1
4||A1,2 4="Entrance %1%" Entrance A1, Max
4||A1,5||3 5="%0%, Floor %1%" Entrance A1, Floor 3
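The rows of the table above can be reproduced with a short composition routine. The sketch below assumes, as in the earlier example, that %0% stands for the output of the previous step and %1% for the seed of the current step. The function `localize` and the `CATALOG` dictionary are hypothetical, and the catalog contains only the strings listed in the "New Parts" column, stored under the module name Watchamacallit used earlier.

```python
import re

# String templates per (module, string ID), taken from the table above.
CATALOG = {
    ("Watchamacallit", 1): "Temperature",
    ("Watchamacallit", 2): "%0%, Max",
    ("Watchamacallit", 3): "Input %1%",
    ("Watchamacallit", 4): "Entrance %1%",
    ("Watchamacallit", 5): "%0%, Floor %1%",
    ("MathModule", 1): "sum(%0%)",
}


def localize(string_ids, module, catalog):
    """Compose a display string from a stringIds value, step by step."""
    output = ""
    for item in string_ids.split(","):
        parts = item.split("|", 2)
        step_module = parts[1] if len(parts) > 1 and parts[1] else module
        seed = parts[2] if len(parts) > 2 else ""
        template = catalog[(step_module, int(parts[0]))]
        previous = output
        # Replace both placeholders in one pass so that inserted text is
        # never scanned again for placeholders.
        output = re.sub(
            r"%([01])%",
            lambda match: previous if match.group(1) == "0" else seed,
            template,
        )
    return output
```

Translating the catalog entries is then enough to present the same field names in another language; the stringIds values reported by the device stay unchanged.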

This document has not touched upon security in sensor networks. There are mainly three concerns that implementers of sensor networks need to consider:

This document requires no interaction with &IANA;.

REQUIRED.


Thanks to Joachim Lindborg, Karin Forsell, Tina Beckman, Kevin Smith and Tobias Markmann for all valuable feedback.