<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE xep SYSTEM 'xep.dtd' [
<!ENTITY % ents SYSTEM "xep.ent">
%ents;
]>
<?xml-stylesheet type='text/xsl' href='xep.xsl'?>
<xep>
<header>
<title>A Framework For Securing Jabber Conversations</title>
<abstract>
Although the value and utility of contemporary instant messaging
systems, like Jabber, are now indisputable, current security
features to protect message data are generally inadequate for
many deployments; this is particularly true in security-conscious
environments like large, commercial enterprises and government
agencies. These current features suffer from issues of
scalability, usability, and supported features. Furthermore, there is a
lack of standardization.
We present a protocol to allow communities of Jabber users to
apply cryptographic protection to selected conversation data.
</abstract>
&LEGALNOTICE;
<number>0031</number>
<status>Deferred</status>
<type>Standards Track</type>
<sig>Standards</sig>
<dependencies/>
<supersedes/>
<supersededby/>
<shortname>N/A</shortname>
<author>
<firstname>Paul</firstname>
<surname>Lloyd</surname>
<email>paul_lloyd@hp.com</email>
<jid>paul_lloyd@jabber.hp.com (private)</jid>
</author>
<revision>
<version>0.2</version>
<date>2002-07-09</date>
<initials>PCL</initials>
<remark>
updated to reflect group consensus to incorporate XML Encryption, as well
as other group comments from Draft 0.9.
</remark>
</revision>
<revision>
<version>0.1</version>
<date>2002-05-07</date>
<initials>
PCL
</initials>
<remark>
initial version
</remark>
</revision>
</header>
<section1 topic="Introduction">
<p>
Instant messaging has clearly crossed the chasm from experimental
to mainstream in a short amount of time. It is particularly
interesting to note the extent to which the employees and
affiliates of large enterprises have adopted instant messaging as
part of their daily professional lives. IM is no longer simply
used on Friday evening to select which movie to watch; it's now
used on Monday morning to select which company to acquire.
</p>
<p>
While the benefits of IM are clear and compelling, the risks
associated with sharing sensitive information in an IM
environment are often overlooked. We need a mechanism that
permits communities of users to protect their IM conversations.
This document presents an extension protocol that can be
incorporated into the existing Jabber protocol to provide such a
mechanism. We hope that this protocol spurs both interest
and further investigation into mechanisms to protect Jabber
conversations. We also hope that the Jabber community can
accelerate the adoption of standardized security mechanisms.
</p>
<p>
In addition to its ability to protect traditional messaging data,
the proposed protocol may also serve as a foundation for securing
other data transported via other Jabber extensions.
</p>
<p>
We use the following terms throughout this document to describe
the most relevant aspects of the IM environment that we wish to
address:
</p>
<ul>
<li>
<p>
user. A user is simply any Jabber user. Users are uniquely
identified by a JID; they connect to Jabber hosts using a
Jabber node.
</p>
<p>
Users produce and consume information, and we wish to
provide them with mechanisms that can be used to protect
this information.
</p>
</li>
<li>
<p>
community. A community is a collection of users who wish to
communicate via Jabber. No restrictions or assumptions are
made about the size of communities or the geographical,
organizational, or national attributes of the members.
Communities are assumed to be dynamic and ad-hoc. Users
typically join communities by the simple act of invitation.
All members of a community are assumed to be peers.
</p>
<p>
The members of communities share information among
themselves, and we wish to provide them with mechanisms
that ensure this information is shared only among community
members.
</p>
</li>
<li>
<p>
conversation. A conversation is the set of messages
that flows among the members of a community via some
network. Conversations consist of both the actual
conversation data produced and consumed by the various
users as well as the Jabber protocol elements that
transport it. Members participate in a conversation when
they are the source or destination of this traffic.
</p>
<p>
In hostile network environments, like the Internet,
conversation data is vulnerable to a variety of well-known
attacks.
</p>
</li>
</ul>
<p>
Other Jabber and IM terms are used in a traditional, intuitive
fashion.
</p>
</section1>
<section1 topic="Requirements And Considerations">
<p>
The proposed protocol is designed to address the specific
requirements and considerations presented in this section.
</p>
<section2 topic="Security Requirements">
<section3 topic="Data Protection Requirements">
<p>
A secure IM system must permit conversation participants to
preserve the following properties of their conversation data:
</p>
<ul>
<li>
<p>
confidentiality. Conversation data must only be disclosed
to authorized recipients.
</p>
</li>
<li>
<p>
integrity. Conversation data must not be altered.
</p>
</li>
<li>
<p>
data origin authentication. Recipients must be able to
determine the identity of the sender and trust that the
message did, in fact, come from the sender. It is important
to note that this requirement does not include the
requirement of a durable digital signature on conversation
data.
</p>
</li>
<li>
<p>
replay protection. Recipients must be able to detect and
ignore duplicate conversation data.
</p>
</li>
</ul>
<p>
These are established, traditional goals of information security
applied to the conversation data. In the IM environment, these
goals protect against the following attacks:
</p>
<ul>
<li>
<p>
eavesdropping, snooping, etc.
</p>
</li>
<li>
<p>
masquerading as a conversation participant
</p>
</li>
<li>
<p>
forging messages
</p>
</li>
</ul>
<p>
Preserving the availability of conversation data is not addressed
by this protocol.
</p>
<p>
Preserving the anonymity of conversation participants is an
interesting topic which we defer for future exploration.
</p>
<p>
Finally, note that this protocol does not address authentication
between a Jabber node and a Jabber host.
</p>
</section3>
<section3 topic="Data Classification Requirements">
<p>
A secure IM system must support a data classification feature through the use
of security labeling. Conversation participants must be
able to associate a security label with each piece of
conversation data. This label may be used to specify a data
classification level for the conversation data.
</p>
</section3>
<section3 topic="The End To End Requirement">
<p>
It is easy to imagine Jabber systems in which the servers play
active, fundamental roles in the protection of conversation
data. Such systems could offer many advantages, like:
</p>
<ul>
<li>
<p>
allowing the servers to function as credential issuing
authorities,
</p>
</li>
<li>
<p>
allowing the servers to function as policy enforcement
points.
</p>
</li>
</ul>
<p>
Unfortunately, such systems have significant disadvantages when
one considers the nature of instant messaging:
</p>
<ul>
<li>
<p>
Many servers may be untrusted, public servers.
</p>
</li>
<li>
<p>
In many conversation communities, decisions of trust and
membership can only be adequately defined by the members
themselves.
</p>
</li>
<li>
<p>
In many conversation communities, membership in the
community changes in real time based upon the dynamics of
the conversation.
</p>
</li>
<li>
<p>
In many conversation communities, the data classification of
the conversation changes in real time based upon the
dynamics of the conversation.
</p>
</li>
</ul>
<p>
The widespread use of gateways to external IM systems
complicates matters further.
</p>
<p>
Based on this analysis, we propose that security be entirely
controlled in an end-to-end fashion by the conversation
participants themselves via their user agent software.
</p>
</section3>
<section3 topic="Trust Issues">
<p>
We believe that, ultimately, trust decisions are in the hands of
the conversation participants. A security protocol and
appropriate conforming user agents must provide a mechanism for them to make
informed decisions.
</p>
</section3>
<section3 topic="Cryptosystem Design Considerations">
<p>
One of the accepted axioms of security is that people must avoid
the temptation to start from scratch and produce new, untested
algorithms and protocols. History has demonstrated that such
approaches are likely to contain flaws and that considerable time
and effort are required to identify and address all of these
flaws. Any new security protocol should be based on existing,
established algorithms and protocols.
</p>
</section3>
</section2>
<section2 topic="Environmental Considerations">
<p>
Any new IM security protocol must integrate smoothly into the
existing IM environment, and it must also recognize the nature of
the transactions performed by conversation participants. These
considerations are especially important:
</p>
<ul>
<li>
<p>
dynamic communities. The members of a community are defined
in near real time by the existing members.
</p>
</li>
<li>
<p>
dynamic conversations. Conversations may involve any
possible subset of the entire set of community members.
</p>
</li>
</ul>
<p>
Addressing these considerations becomes especially crucial when
selecting a conference keying mechanism.
</p>
</section2>
<section2 topic="Usability Requirements">
<p>
Given the requirement to place the responsibility for the
protection of conversation data in the hands of the participants,
it is imperative to address some fundamental usability issues:
</p>
<ul>
<li>
<p>
First, overall ease of use is a requirement. For protocol
purposes, one implication is that some form of
authentication via passphrases is necessary. While we
recognize that this can have appalling consequences,
especially when we realize that a passphrase may be shared
by all of the community members, we also recognize its
utility.
</p>
</li>
<li>
<p>
PKIs are well established in many large organizations, and
some communities will prefer to rely on credentials issued
from these authorities. To ensure ease of use, we must
strive to allow the use of existing PKI credentials and
trust models rather than impose closed, Jabber-specific
credentials.
</p>
</li>
<li>
<p>
Finally, performance must not be negatively impacted; this
is particularly true if we accept that most communities are
composed of human users conversing in real time. For
protocol purposes, one obvious implication is the desire to
minimize computationally expensive public key operations.
</p>
</li>
</ul>
<p>
We note that, in practice, the design and construction of user
agents will also have a major impact on ease of use.
</p>
</section2>
<section2 topic="Development And Deployment Requirements">
<p>
To successfully integrate into the existing Jabber environment,
an extension protocol for security must satisfy the following:
</p>
<ul>
<li>
<p>
It must be an optional extension of the existing Jabber protocol.
</p>
</li>
<li>
<p>
It must be transparent to existing Jabber servers.
</p>
</li>
<li>
<p>
It must function gracefully in cases where some community
members are not running a user agent that supports the
protocol.
</p>
</li>
<li>
<p>
It must make good use of XML.
</p>
</li>
<li>
<p>
It must avoid encumbered algorithms.
</p>
</li>
<li>
<p>
It must be straightforward to implement using widely
available cryptographic toolkits.
</p>
</li>
<li>
<p>
It must not require a PKI.
</p>
</li>
</ul>
<p>
Failure to accommodate these will impede or prohibit adoption of
any security protocol.
</p>
</section2>
</section1>
<section1 topic="Protocol Specification">
<section2 topic="Protocol Overview">
<p>
Ultimately, conversation data is protected by the application of
keyed cryptographic operations. One operation is used to provide
confidentiality, and a separate operation is used to provide
integrity and data origin authentication. The keys used to
parameterize these operations are called conversation keys. Each
conversation should have its own unique set of conversation keys
shared among the conversation participants.
</p>
<p>
Conversation keys are transported among the conversation
participants within a negotiated security session. A security session allows
pairs of conversation participants to securely share conversation keys, so that
keys can be propagated to all participants in the conversation as required.
</p>
</section2>
<section2 topic="Definitions And Notation">
<p>
The following terms are used throughout this specification:
</p>
<ul>
<li>
<p>
initiator. The initiator is the user who requested a security session
negotiation. Initiators are identified by their JID.
</p>
</li>
<li>
<p>
responder. The responder is the user who responded to a security session
negotiation request. Responders are identified by their JID.
</p>
</li>
<li>
<p>
hmac. This indicates the HMAC algorithm. The notation hmac(key, value)
indicates the HMAC computation of value using key.
</p>
</li>
<li>
<p>
concatenation operator. The '|' character is used in character or octet
string expressions to indicate concatenation.
</p>
</li>
<li>
<p>
security session ID. A character string that uniquely identifies a
security session between two users. Security session IDs MUST only
consist of Letters, Digits, and these characters: '.', '+', '-',
'_', '@'. Security session IDs are case sensitive.
</p>
</li>
<li>
<p>
SS. This term indicates the security session secret that is agreed to
during a security session negotiation.
</p>
</li>
<li>
<p>
SKc. This term indicates the keying material used within a security session
to protect confidentiality. The SKc is derived from the security session secret, SS.
</p>
</li>
<li>
<p>
SKi. This term indicates the keying material used within a security session
to protect integrity and to provide authentication. The SKi is derived from the
security session secret, SS.
</p>
</li>
<li>
<p>
conversation key ID. A character string that uniquely identifies a
conversation key shared by a community of users. Conversation key IDs MUST only
consist of Letters, Digits, and these characters: '.', '+', '-',
'_', '@'. Conversation key IDs are case sensitive. Conversation key IDs SHOULD
be generated from at least 128 random bits.
</p>
</li>
<li>
<p>
passphrase ID. A character string that uniquely identifies a
passphrase shared by a community of users. Passphrase IDs MUST only
consist of Letters, Digits, and these characters: '.', '+', '-',
'_', '@'. Passphrase IDs are case sensitive.
</p>
</li>
</ul>
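<p>
The identifier rules above can be sketched as follows. This is only an
illustration: it assumes that "Letters" and "Digits" mean ASCII letters
and digits, which this specification does not state, and the function
names are hypothetical.
</p>
<example><![CDATA[
```python
import re
import secrets

# Assumption: "Letters" and "Digits" are read here as ASCII letters and digits.
_ID_PATTERN = re.compile(r"[A-Za-z0-9.+\-_@]+")


def is_valid_id(candidate: str) -> bool:
    """Return True if candidate uses only the permitted ID characters."""
    return _ID_PATTERN.fullmatch(candidate) is not None


def new_conversation_key_id() -> str:
    """Generate an ID from 128 random bits, encoded as 32 hex characters."""
    return secrets.token_hex(16)  # 16 octets = 128 bits
```
]]></example>
<p>
Because IDs are case sensitive, such a check should be followed by an
exact string comparison, never a case-folded one.
</p>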
</section2>
<section2 topic="XML Processing">
<p>
Since cryptographic operations are applied to data that is
transported within an XML stream, the protocol defines a set of
rules to ensure a consistent interpretation by all conversation
participants.
</p>
<section3 topic="Transporting Binary Content">
<p>
Binary data, such as the result of an HMAC, is always transported
in an encoded form; the two supported encoding schemes are base64
and hex.
</p>
<p>
Senders MAY include arbitrary white space within the character
stream. Senders SHOULD NOT include any other characters outside
of the encoding set.
</p>
<p>
Receivers MUST ignore all characters not in the encoding set.
</p>
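<p>
A receiver's handling of encoded binary content might look like the
following minimal sketch. The function name is hypothetical, and the
sketch assumes that the '=' padding character belongs to the base64
encoding set.
</p>
<example><![CDATA[
```python
import base64
import re


def decode_binary(text: str, encoding: str) -> bytes:
    """Decode hex or base64 data, ignoring characters outside the encoding set."""
    if encoding == "hex":
        cleaned = re.sub(r"[^0-9A-Fa-f]", "", text)
        return bytes.fromhex(cleaned)
    if encoding == "base64":
        # Assumption: '=' padding is treated as part of the encoding set.
        cleaned = re.sub(r"[^A-Za-z0-9+/=]", "", text)
        return base64.b64decode(cleaned)
    raise ValueError(f"unsupported encoding: {encoding}")
```
]]></example>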
</section3>
<section3 topic="Transporting Encrypted Content">
<p>
Encrypted data, including wrapped cryptographic keys, is always
wrapped per XML Encryption.
</p>
</section3>
<section3 topic="HMAC Computation">
<p>
HMACs are computed over a specific collection of attribute values
and character data; when computing an HMAC the following rules
apply:
</p>
<ul>
<li>
<p>
All characters MUST be encoded in UTF-8.
</p>
</li>
<li>
<p>
The octets in each character MUST be processed in network
byte order.
</p>
</li>
<li>
<p>
For a given element, the attribute values that are HMACed
MUST be processed in the specified order regardless of the
order in which they appear in the element tag.
</p>
</li>
<li>
<p>
For each attribute value, the computation MUST only include
characters from the anticipated set defined in this
specification; in particular, white space MUST always be
ignored.
</p>
</li>
<li>
<p>
For character data that is represented in an encoded form,
such as base64 or hex, the computation MUST only include
valid characters from the encoding set.
</p>
</li>
</ul>
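<p>
The rules above can be illustrated with the following sketch, which
uses HMAC-SHA1. It is a simplification under stated assumptions: the
values are concatenated in the order supplied by the caller, only white
space is filtered out (a full implementation would also drop any other
characters outside the anticipated set), and the function names are
hypothetical.
</p>
<example><![CDATA[
```python
import hashlib
import hmac


def strip_whitespace(value: str) -> str:
    """Drop all white space, which the rules exclude from the computation."""
    return "".join(value.split())


def compute_hmac(key: bytes, values: list[str]) -> str:
    """HMAC-SHA1 over the values, taken in the caller-specified order."""
    mac = hmac.new(key, digestmod=hashlib.sha1)
    for value in values:
        # Each value is UTF-8 encoded before it is fed to the HMAC.
        mac.update(strip_whitespace(value).encode("utf-8"))
    return mac.hexdigest()
```
]]></example>
<p>
Since white space is ignored, two renderings of the same element that
differ only in formatting yield the same HMAC, while reordering the
values changes it.
</p>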
</section3>
<section3 topic="Performing Cryptographic Operations">
<p>
The following algorithm is used to encrypt a character string, such as
an XML element:
</p>
<ul>
<li>
<p>
The character string MUST be encoded in UTF-8.
</p>
</li>
<li>
<p>
The octets in each character MUST be processed in network byte order.
</p>
</li>
<li>
<p>
Appropriate cryptographic algorithm parameters, such as an
IV for a block cipher, are generated.
</p>
</li>
</ul>
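<p>
The preparation steps above might be sketched as follows. The cipher
names are those listed in this specification's DTD; the function name is
hypothetical, and the choice of a random IV is an assumption. The
encryption step itself is omitted because it requires a cryptographic
toolkit.
</p>
<example><![CDATA[
```python
import secrets

# Block sizes in octets for the CBC ciphers named in this specification's DTD.
BLOCK_SIZE = {"3des-cbc": 8, "aes-128-cbc": 16, "aes-256-cbc": 16}


def prepare_cipher_input(text: str, cipher: str) -> tuple[bytes, bytes]:
    """Return (plaintext octets, fresh IV) for the given cipher."""
    plaintext = text.encode("utf-8")
    # Assumption: the IV is random and one cipher block long.
    iv = secrets.token_bytes(BLOCK_SIZE[cipher])
    return plaintext, iv
```
]]></example>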
</section3>
</section2>
<section2 topic="XML Namespaces">
<p>
In order to integrate smoothly with the existing Jabber protocol,
this protocol utilizes a new XML namespace, jabber:security.
</p>
</section2>
<section2 topic="Security Sessions">
<section3 topic="Overview">
<p>
A security session is a pair-wise relationship between two users
in which the users have achieved the following:
</p>
<ul>
<li>
<p>
They have mutually authenticated each other using credentials acceptable to both.
</p>
</li>
<li>
<p>
They have agreed on a set of key material known only to both.
</p>
</li>
</ul>
<p>
Security sessions are identified by a 3-tuple consisting of the following items:
</p>
<ul>
<li>
<p>
initiator. This is the JID of the user who initiated the session.
</p>
</li>
<li>
<p>
responder. This is the JID of the user who responded to the initiator's request.
</p>
</li>
<li>
<p>
sessionId. A label generated by the initiator.
</p>
</li>
</ul>
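<p>
A user agent could key its session table on this 3-tuple, as in the
following sketch. The class name, the store, and the example JIDs are
hypothetical.
</p>
<example><![CDATA[
```python
from typing import NamedTuple


class SessionKey(NamedTuple):
    """The 3-tuple that identifies a security session."""
    initiator: str   # JID of the user who initiated the session
    responder: str   # JID of the user who responded
    session_id: str  # label generated by the initiator


# Hypothetical in-memory store mapping each session to its state.
sessions: dict[SessionKey, dict] = {}

key = SessionKey("alice@example.com", "bob@example.com", "s-1")
sessions[key] = {"state": "negotiating"}
```
]]></example>
<p>
Because the initiator and responder occupy distinct positions, a session
that one user opened toward another is not confused with one opened in
the opposite direction, even if both carry the same sessionId.
</p>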
<p>
Security sessions are used to transport conversation keys between the conversation participants.
</p>
<p>
Scalability is an immediate, obvious concern with such an approach. We expect this
approach to be viable in practice because:
</p>
<ul>
<li>
<p>
The number of participants in typical, interactive conversations is generally on the order of 10^1.
</p>
</li>
<li>
<p>
New participants usually join a conversation dynamically
at the invitation of an existing participant;
this existing participant is the only one who needs to
establish a security session with the new participant,
because this single security session can be used to
transport all of the required conversation keys.
</p>
</li>
<li>
<p>
User agents can permit the lifetime of security sessions to
last long enough to allow transport of conversation keys
for a variety of conversations.
</p>
</li>
<li>
<p>
Conversation keys can be established with a suitable lifetime.
</p>
</li>
</ul>
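<p>
As a rough illustration of the scalability argument, the following
sketch compares the number of pairwise security sessions in two cases:
every participant pairs with every other one, or each participant after
the first is brought in by a single existing member. The function names
are hypothetical and the model is deliberately simplified.
</p>
<example><![CDATA[
```python
def full_mesh_sessions(n: int) -> int:
    """Pairwise sessions if every participant pairs with every other one."""
    return n * (n - 1) // 2


def invitation_sessions(n: int) -> int:
    """Sessions if each participant after the first is invited by one member."""
    return max(n - 1, 0)
```
]]></example>
<p>
For a conversation of ten participants this gives 45 sessions in the
full-mesh case and 9 in the invitation case.
</p>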
<p>
Other approaches, including the incorporation of more
sophisticated conference keying algorithms, are a topic for
future exploration.
</p>
</section3>
<section3 topic="Security Session Negotiation">
<p>
Security sessions are negotiated using an authenticated Diffie-Hellman key agreement
exchange. The two goals of the exchange are to perform the mutual authentication
and to agree on a secret that is known only to the two parties.
</p>
<p>
The exchange also allows the parties to negotiate the various algorithms
and authentication mechanisms that will be used.
</p>
<p>
Once the pair agree on a shared secret, they each derive key material from the
secret; this key material is used to securely transport the conversation keys,
which are used to actually protect conversation data.
</p>
<p>
The protocol data units (PDUs) that comprise the exchange are transported
within existing Jabber protocol elements.
</p>
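<p>
The key agreement at the heart of the exchange can be sketched as
follows. This is a toy illustration only: the modulus is a small
demonstration prime, not one of the groups named in this specification,
the function names are hypothetical, and the authentication that the
real exchange requires is omitted entirely.
</p>
<example><![CDATA[
```python
import secrets

# Toy parameters for illustration only: a 127-bit Mersenne prime, far too
# small for real use and NOT one of the modp groups in this specification.
P = 2**127 - 1
G = 3


def dh_keypair() -> tuple[int, int]:
    """Return (private x, public g^x mod p)."""
    x = secrets.randbelow(P - 3) + 2  # private exponent in [2, P-2]
    return x, pow(G, x, P)


def dh_shared_secret(own_private: int, peer_public: int) -> int:
    """Compute the shared secret (g^y)^x mod p."""
    return pow(peer_public, own_private, P)
```
]]></example>
<p>
Each party combines its own private key with the peer's public key, and
both arrive at the same value, from which key material can then be
derived.
</p>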
</section3>
<section3 topic="DTDs">
<example>
&lt;!ELEMENT session1
(nonce, keyAgreement, algorithms, authnMethods) &gt;
&lt;!ATTLIST session1
version CDATA #REQUIRED
initiator CDATA #REQUIRED
responder CDATA #REQUIRED
sessionId CDATA #REQUIRED
hmac (hmac-sha1) #REQUIRED &gt;
&lt;!ELEMENT nonce
(#PCDATA)* &gt;
&lt;!ATTLIST nonce
encoding (base64 | hex) #REQUIRED &gt;
&lt;!ELEMENT keyAgreement
(dh) &gt;
&lt;!ELEMENT dh
(publicKey) &gt;
&lt;!ATTLIST dh
group (modp1024 | modp2048 | modp4096 | modp8192) #REQUIRED &gt;
&lt;!ELEMENT publicKey
(#PCDATA)* &gt;
&lt;!ATTLIST publicKey
encoding (base64 | hex) #REQUIRED &gt;
&lt;!ELEMENT algorithms
(algorithm)+ &gt;
&lt;!ELEMENT algorithm
(confAlg, hmacAlg) &gt;
&lt;!ELEMENT confAlg EMPTY &gt;
&lt;!ATTLIST confAlg
cipher (3des-cbc | aes-128-cbc | aes-256-cbc) #REQUIRED &gt;
&lt;!ELEMENT hmacAlg EMPTY &gt;
&lt;!ATTLIST hmacAlg
alg (hmac-sha1 | hmac-md5) #REQUIRED&gt;
&lt;!ELEMENT authnMethods
(authnMethod)+ &gt;
&lt;!ELEMENT authnMethod
(digSig | passphrase) &gt;
&lt;!ELEMENT digSig
(certificate+, caCertificate*) &gt;
&lt;!ATTLIST digSig
alg (rsa) #REQUIRED&gt;
&lt;!ELEMENT certificate
(#PCDATA)* &gt;
&lt;!ATTLIST certificate
type (x509 | pkcs7) #REQUIRED
encoding (base64 | hex) #REQUIRED &gt;
&lt;!ELEMENT caCertificate
(#PCDATA)* &gt;
&lt;!ATTLIST caCertificate
type (x509 | pkcs7) #REQUIRED
encoding (base64 | hex) #REQUIRED &gt;
&lt;!ELEMENT passphrase EMPTY &gt;
&lt;!ATTLIST passphrase
passphraseId CDATA #REQUIRED &gt;
&lt;!ELEMENT session2
(nonce, keyAgreement, algorithm, authnMethod, authenticator) &gt;
&lt;!ATTLIST session2
version CDATA #REQUIRED
initiator CDATA #REQUIRED
responder CDATA #REQUIRED
sessionId CDATA #REQUIRED
hmac (hmac-sha1) #REQUIRED &gt;
&lt;!ELEMENT authenticator
(#PCDATA)* &gt;
&lt;!ATTLIST authenticator
encoding (base64 | hex) #REQUIRED&gt;
&lt;!ELEMENT session3
(authenticator, keyTransport*) &gt;
&lt;!ATTLIST session3
version CDATA #REQUIRED
initiator CDATA #REQUIRED
responder CDATA #REQUIRED
sessionId CDATA #REQUIRED
hmac (hmac-sha1) #REQUIRED &gt;
</example>
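<p>
To make the session1 content model concrete, the following sketch
builds a session1 element that offers a single algorithm pair and
passphrase authentication. The function name and the passphrase ID are
hypothetical placeholders; the version value '1.0' is the one this
specification assigns to itself.
</p>
<example><![CDATA[
```python
import xml.etree.ElementTree as ET


def build_session1(initiator, responder, session_id, nonce_hex, public_key_hex):
    """Build a session1 element with one algorithm pair and passphrase authn."""
    s1 = ET.Element("session1", {
        "version": "1.0",
        "initiator": initiator,
        "responder": responder,
        "sessionId": session_id,
        "hmac": "hmac-sha1",
    })
    # Children follow the DTD order: nonce, keyAgreement, algorithms, authnMethods.
    ET.SubElement(s1, "nonce", {"encoding": "hex"}).text = nonce_hex
    dh = ET.SubElement(ET.SubElement(s1, "keyAgreement"), "dh", {"group": "modp1024"})
    ET.SubElement(dh, "publicKey", {"encoding": "hex"}).text = public_key_hex
    algorithm = ET.SubElement(ET.SubElement(s1, "algorithms"), "algorithm")
    ET.SubElement(algorithm, "confAlg", {"cipher": "aes-128-cbc"})
    ET.SubElement(algorithm, "hmacAlg", {"alg": "hmac-sha1"})
    method = ET.SubElement(ET.SubElement(s1, "authnMethods"), "authnMethod")
    # Placeholder passphrase ID, for illustration only.
    ET.SubElement(method, "passphrase", {"passphraseId": "example-passphrase-id"})
    return s1
```
]]></example>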
</section3>
<section3 topic="Generating And Sending the session1 PDU">
<p>
The initiator's user agent employs the following algorithm to generate the session1 PDU:
</p>
<ul>
<li>
<p>
Appropriate values for the version, initiator, responder,
sessionId, and hmac attributes are assembled. The version of
this specification is '1.0'. The values of initiator and
responder MUST be the JIDs of the two participants,
respectively.
</p>
</li>
<li>
<p>
The nonce is prepared by first generating a string of 20
random octets (160 random bits). The octets are then
encoded into a string of 40 hex characters representing the
random string.
</p>
</li>
<li>
<p>
A Diffie-Hellman group is selected. The appropriate values
for g and p will be used to generate the initiator's public
key.
</p>
</li>
<li>
<p>
An ephemeral private key, x, is generated for the selected
group. This key MUST be generated using an
appropriate random number source. The corresponding public
key, g^x mod p, is computed and encoded.
</p>
</li>
<li>
<p>
The desired set of confidentiality and HMAC cryptographic
algorithms is selected. The manner in which these
algorithms are selected and all related policy issues are
outside the scope of this specification.
</p>
</li>
<li>
<p>
The desired set of authentication algorithms is selected.
The manner in which these algorithms are selected and all
related policy issues are outside the scope of this
specification. When the digital signature form of
authentication is selected, the relevant end-entity
certificate and, optionally, a chain of CA certificates
representing a validation path are assembled and encoded. A
set of trusted CA certificates MAY optionally be included
via caCertificate elements; if so, the set MUST include the
issuer of the initiator's end-entity certificate.
</p>
</li>
</ul>
<p>
These values are then used to prepare the XML session1 element;
this element is transmitted via the existing Jabber iq mechanism:
</p>
<example>
&lt;iq from="initiator's JID" to="responder's JID" type="get" id="whatever"&gt;
&lt;query xmlns="jabber:security:session"&gt;
&lt;session1&gt;...&lt;/session1&gt;
&lt;/query&gt;
&lt;/iq&gt;
</example>
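<p>
The nonce preparation and the iq wrapping shown above can be sketched
as follows. The function names are hypothetical, and the namespace is
set as a plain attribute purely to keep the serialization simple.
</p>
<example><![CDATA[
```python
import secrets
import xml.etree.ElementTree as ET


def new_nonce() -> str:
    """Return 20 random octets (160 bits) encoded as 40 hex characters."""
    return secrets.token_bytes(20).hex()


def wrap_in_iq(session1: ET.Element, initiator: str, responder: str,
               iq_id: str) -> ET.Element:
    """Wrap a session1 element in an iq of type "get"."""
    iq = ET.Element("iq", {"from": initiator, "to": responder,
                           "type": "get", "id": iq_id})
    # The namespace is written as a literal attribute for simplicity.
    query = ET.SubElement(iq, "query", {"xmlns": "jabber:security:session"})
    query.append(session1)
    return iq
```
]]></example>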
</section3>
<section3 topic="Receiving And Processing the session1 PDU">
<p>
The responder's user agent employs the following algorithm to process each session1 PDU:
</p>
<ul>
<li>
<p>
The version and hmac attributes are checked against the
values supported by the user agent. An unsupported version
results in an error code of 10000, and an unsupported hmac
results in an error code of 10001. The responder attribute MUST
match the JID of the receiver; a mismatch results in an error code of 10009.
</p>
</li>
<li>
<p>
The nonce is decoded, and its length is checked. The nonce
may also be checked to detect replays. An invalid nonce
results in an error code of 10002.
</p>
</li>
<li>
<p>
The Diffie-Hellman group is checked against the values
supported by the user agent. An unsupported group results
in an error code of 10003.
</p>
</li>
<li>
<p>
The desired confidentiality and HMAC cryptographic
algorithms are selected from the proposed set. The manner
in which these algorithms are selected and all related
policy issues are outside the scope of this specification.
If none of the proposed algorithms are supported, an error
code of 10004 occurs.
</p>
</li>
<li>
<p>
The desired authentication algorithm is selected from the
proposed set. The manner in which this algorithm is
selected and all related policy issues are outside the