Friday, June 7, 2013

WebRTC: Security and Confidentiality

One of the interesting aspects of WebRTC is that it has encryption baked right into it; there's actually no way to send unencrypted media using a WebRTC implementation. The developing specifications currently use DTLS-SRTP keying[1], and that's what both Chrome and Firefox implement. The general idea here is that a Diffie-Hellman key exchange takes place in the media channel, without the web site -- or even the JavaScript application -- being involved at all.
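To make the key-exchange idea concrete, here is a toy Diffie-Hellman sketch in Python. It is only an illustration: the tiny prime, the generator, and the names `alice` and `bob` are assumptions chosen for readability, and a real DTLS handshake uses far larger parameters (or elliptic curves) and authenticates the exchange. The point is just that only the public values cross the wire, so anything that doesn't see a private value, the web site included, can't compute the shared secret.

```python
# Toy Diffie-Hellman exchange. The parameters are deliberately tiny and
# insecure; they are hypothetical values chosen only to keep the math visible.
P = 23  # public prime modulus
G = 5   # public generator


def public_value(private: int) -> int:
    """Value a party sends over the media channel."""
    return pow(G, private, P)


def shared_secret(their_public: int, my_private: int) -> int:
    """Secret each party derives locally; it is never transmitted."""
    return pow(their_public, my_private, P)


# Each endpoint picks a private number and keeps it to itself.
alice_private = 4
bob_private = 3

# Only these two values are exchanged.
alice_public = public_value(alice_private)
bob_public = public_value(bob_private)

# Both ends arrive at the same secret without ever sending it.
alice_secret = shared_secret(bob_public, alice_private)
bob_secret = shared_secret(alice_public, bob_private)
assert alice_secret == bob_secret
```

An observer who sees `alice_public` and `bob_public` would have to solve a discrete logarithm to recover the secret, which is easy at this size and impractical at real-world sizes.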

But who are you talking to?

This is only part of the story, though. While encryption is one of the tools necessary to prevent eavesdropping by other parties, it is in no way sufficient. Unless you have some way to demonstrate that the other end of your encrypted connection is under the control of the person you're talking to, you could easily be sending your media to a server that is capable of sharing your conversation with an arbitrary third party. One important tool to help with this problem is the WebRTC identity work currently underway in the W3C. This isn't ready in any implementations that I'm aware of yet, but it's definitely something that needs to happen before we consider WebRTC done.

The general idea behind the identity work is that, as part of key exchange, you also get enough information to prove, in a cryptographically verifiable way, that the other end of the connection is who you think it is. Of course, there are still some tricky aspects to this (you have to, for example, trust Google not to sign off on someone other than me being ""[2]), but you can at least reduce the problem from trusting one party (the website hosting the WebRTC application) to trusting that two parties (the website and the identity provider) won't collude.

The other tool necessary to ensure the confidentiality of contents is making sure that the media isn’t being copied by the JavaScript itself and sent to an alternate destination. This isn’t part of any current specification, but we’re working on adding a standardized mechanism that will allow specific user media streams to be limited so that they can only be sent, over an encrypted channel, to a specified identity (and nowhere else).

On top of this, web browser developers have a very difficult task in presenting this to users in a way that they can use. The nuances between (and implications of) "this is encrypted but we can't prove who you're talking to" versus "this is being encrypted and sent directly to (at least if you trust Google)" are very subtle. Rendering this to users is a thorny challenge, and one that's going to take time to get right.

And who knows who you are talking to?

Of course, none of this is perfect. The recent Verizon brouhaha is about a database of who is communicating with whom (known in the communications interception community as a "pen register"), not actually listening in on phone calls. It uses telephone numbers as identifiers, which are pretty easy to correlate to an owner. WebRTC can't prevent this kind of information from being collected; the records would simply use IP addresses where Verizon's use phone numbers. IP addresses aren't much harder to correlate to people than phone numbers are, as has been demonstrated by numerous MPAA and RIAA lawsuits.

Even with a good encryption story, WebRTC has no built-in defenses against the collection of this kind of information. Anyone with access to the session description is going to be able to see the IP addresses of both parties to the conversation; and, of course, the website is going to know where the HTTP requests came from. Beyond that, your ISP (and every backbone provider between you and the other end of the call) can easily see which IP addresses you're sending information to, and picking media streams out (even encrypted media streams) is a trivial exercise for the kinds of equipment ISPs and backbone providers deploy.
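As a rough illustration of how little effort that takes, here is a small Python sketch that pulls addresses out of a session description. The SDP text is a made-up example (the addresses come from ranges reserved for documentation), and `addresses_in_sdp` is a hypothetical helper, not part of any WebRTC API. It looks only at the origin (`o=`), connection (`c=`), and ICE candidate (`a=candidate:`) lines.

```python
# Hypothetical session description, trimmed to the lines that carry addresses.
EXAMPLE_SDP = """\
v=0
o=- 20518 0 IN IP4 203.0.113.1
s=-
t=0 0
c=IN IP4 203.0.113.1
m=audio 54400 RTP/SAVPF 0
a=candidate:1 1 UDP 2130706431 192.0.2.10 54400 typ host
a=candidate:2 1 UDP 1694498815 203.0.113.1 54400 typ srflx raddr 192.0.2.10 rport 54400
"""


def addresses_in_sdp(sdp: str) -> set:
    """Collect the IP addresses visible to anyone who can read the SDP."""
    found = set()
    for raw_line in sdp.splitlines():
        line = raw_line.strip()
        if line.startswith(("o=", "c=")):
            # The address is the last field on both origin and connection lines.
            found.add(line[2:].split()[-1])
        elif line.startswith("a=candidate:"):
            # Fields: foundation component transport priority address port ...
            fields = line[len("a=candidate:"):].split()
            found.add(fields[4])
    return found


print(sorted(addresses_in_sdp(EXAMPLE_SDP)))
```

Note that the host candidate exposes the local address as well as the public one, so the signaling path can learn more than the address the HTTP requests came from.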

The problem is that it's fundamentally difficult to mask who is talking to whom on a network. There are approaches, such as anonymizers and Onion Routers, that can be used to make this more difficult to ascertain; but such approaches have their own weaknesses, and most simply shift trust around from one third party to another.

In summary, WebRTC is taking steps to allow for the contents of communication to remain confidential, but it takes a concerted effort by application developers to bring the right tools together. The less tractable problem of masking who talks to whom is left as out of scope.

[1] There's been recent talk in the IETF RTCWEB working group of adding Security Descriptions (SDES) as an alternate means of key exchange. SDES uses the signaling channel to send the media encryption keys from one end of the connection to the other. This would necessarily allow the web site to access the encryption keys. This means that they (or anyone they handed the keys off to) could decrypt the media, if they have access to it. In terms of stopping some random hacker in the same hotel as you from listening in while you talk to your bank, it's still reasonably effective; in the context of programs like PRISM, or even the pervasive collection of personal data by major internet website operators, it's about as much protection as using tissue paper to stop a bullet.
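To show what "sending the keys over the signaling channel" looks like in practice, here is a minimal Python sketch that reads an SDES `a=crypto` line the way any party on the signaling path could. The key material is fake (just the bytes 0 through 29) and `parse_sdes_crypto` is a hypothetical helper. It assumes the `AES_CM_128_HMAC_SHA1_80` suite, where the inline value is a base64-encoded 16-byte master key followed by a 14-byte master salt.

```python
import base64


def parse_sdes_crypto(line: str) -> dict:
    """Split an SDES a=crypto line into its tag, suite, key, and salt."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        raise ValueError("not an a=crypto line")
    tag, suite, key_params = line[len(prefix):].split()[:3]
    method, _, key_info = key_params.partition(":")
    if method != "inline":
        raise ValueError("unsupported key method: " + method)
    # Optional lifetime and MKI fields follow the key, separated by '|'.
    material = base64.b64decode(key_info.split("|")[0])
    # Assumed suite layout: 16-byte master key, then 14-byte master salt.
    return {
        "tag": int(tag),
        "suite": suite,
        "master_key": material[:16],
        "master_salt": material[16:30],
    }


# Fake key material, used only to build an example line.
fake_material = bytes(range(30))
example_line = (
    "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:"
    + base64.b64encode(fake_material).decode("ascii")
)

parsed = parse_sdes_crypto(example_line)
print(parsed["suite"], parsed["master_key"].hex(), parsed["master_salt"].hex())
```

Nothing here requires breaking any cryptography: whoever relays the session description simply reads the key out of it.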

[2] Whether you choose to do so mostly comes down to whether you trust this blog entry more than this slide.