logo~stef/blog/

Reversing sha256ed MACs

2022-07-29

I was watching Dave Borghuis' fine talk: How I made the municipality pay a 600.000 euro fine for invading your privacy.

But before I start with the topic of this post, let me say that Dave is more than a "Privacy activist / worried & critical citizen", he is in fact the shining light in the Dutch "hacker"-scene, a true role-model. He takes responsibility, protects privacy and approaches not only technology but also authority with mistrust. Latter being an important point in traditional hacker ethics, which I believe we need a lot more (besides people like Dave) in the Netherlands.

But let us dive into the topic of this post. The Dutch city of Enschede has been fined 600 thousand euros by the Dutch data-protection agency for tracking citizens with no proportional purpose or legitimate interest for this data collection/processing. They use the CityTraffic method developed by the RMC consulting company.

The CityTraffic method collects WiFi MAC addresses in beacons hashes them with sha256 and truncates it first to the first 12 hex digits of the hash, and after 24 hours drops 3 more hex digits from the end of the hash, keeping only 9 digits.

A traditional MAC address has two parts:

  • the Organizationally Unique Identifier (OUI): 24-bit number in networking equipment that uniquely identifies its seller/producer.
  • the remaining 24 bits are assigned by the vendor identified by the OUI to uniquely identify the device.

If you download the list of OUIs from the many places on the internetz, you will find that there is currently about 46 thousand entries in that list (46302 in my copy). This is about ~15.49 (log_2(46302))bit of information, combined with the 24 bits of unique device id that brings us to maximum ~39.5 bits of entropy.

At the end of the talk I used the chance to ask Dave if he considered that these truncated hashes can indeed be reversed by using a pre-computed rainbow table? I even postulated that it might be possible to "reverse" the hashes with GPU/FPGA assisted password crackers like hashcat or john the ripper. Today I tried it out on an Nvidia Jetson AGX Xavier, a small development board that comes with 512 NVIDIA CUDA cores for machine learning - or password cracking. This is a low-power board that consumes maybe up to 30W or so, nothing really powerful.

Session..........: hashcat
Status...........: Cracked
Hash.Mode........: 1400 (SHA2-256)
Hash.Target......: 08..47
Time.Started.....: Wed Jul 27 12:51:16 2022 (12 mins, 51 secs)
Time.Estimated...: Wed Jul 27 13:04:07 2022 (0 secs)
Kernel.Feature...: Optimized Kernel
Guess.Base.......: File (/home/user/OUIs), Left Side
Guess.Mod........: Mask (:?1?1:?1?1:?1?1) [9], Right Side
Guess.Charset....: -1 ?dABCDEF, -2 Undefined, -3 Undefined, -4 Undefined
Speed.#1.........:   581.2 MH/s (7.02ms) @ Accel:8 Loops:128 Thr:512 Vec:1
Recovered........: 1/1 (100.00%) Digests
Progress.........: 448010387456/776818655232 (57.67%)
Rejected.........: 0/448010387456 (0.00%)
Restore.Point....: 0/46302 (0.00%)
Restore.Sub.#1...: Salt:0 Amplifier:13672064-13672192 Iteration:0-128
Candidate.Engine.: Device Generator
Candidates.#1....: 00:00:00:D4:B4:1C -> 74:88:8B:E9:B4:1C

Started: Wed Jul 27 12:51:13 2022
Stopped: Wed Jul 27 13:04:08 2022

It took it 12 minutes to recover the MAC address I hashed using sha256, hashcat estimated a total of 20 minutes to test the whole 39 bits of keyspace on this setup. It turns out no rainbow-tables are necessary, anyone can revert the hashes with a few hundred bucks and negligible amount of electric power.

Running hashcat on a full database would take about 20 minutes, it would just continuously spew out recovered MACs as it iterates through the keyspace and finds matching hashes in the DB. This would make the attack much cheaper per MAC. It must be noted that this works on full hashes, not truncated hashes, for truncated hashes hashcat needs to be extended for this new mode, but that should be a simple exercise for the interested party.

However we need to consider that there might be collisions in the database. Lets examine this.

Collisions are when two different input values both compute to the same output result, Normally with sha256 this is computationally impossible, but when you truncate the hash value, this gets easier and easier. Collisions mean that the ids stored by the city/operator of this system would not able to uniquely identify a user, and thus provide (some negligible) anonymity.

Comparing the ~39.5 bits of input entropy with the 12*4 (= 48) bits of output entropy for the hashes stored for 24 hours means that every MAC has with high probability a unique 48 bit truncated 24 hour hash. However older than 24 hour hashes have only 36 bits, which means that each 36 bit hash maps back to ~ 11.3 MAC addresses. This sounds nice, but there is a few concerns:

  1. most valid MAC addresses have a OUI that are clearly not phones.
  2. Even if it is ambiguous which MAC address a hash belongs to, the context like location, time, and possibly other metadata collected, will still uniquely identify a person and by doing so can also recover the MAC address if that is necessary.

But we should also consider the Birthday Paradox, or something similar. Based on the above, the chance that two random MACs hash to p(different) = Π_i^n (1 - (10*i)/2^39.5) the same 36 bit hash is about ((11.3 - 1) / 2^39.5). For n random MAC addresses the probability that they all hash to different hashes can be seen in the expression to the right.

To have a 50% chance to have one collision we need 232 129 random MAC addresses. If we have only 90 000 MAC addresses there is still a ~ 10% chance for a collision and with 27 000 MACS we still get ~ 1% of a chance for collisions, in the other direction we need 560 000 MACs to have a ~ 99% chance of a collision.

Looking at the probabilities in a vacuum looks like there is a chance for some collisions and those lucky will have some anonymity, although the anonymity set is extremely small. However these hashes don't exist in a vacuum, they come with extra metadata which makes re-identification possible despite collisions.

Summing up all of the above we can conclude that this system essentially replaces a public identifier - the MAC - with a secret and still mostly unique identifier. To quote the GDPR Article 4 point 1:

‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number ...

So hashing does not change anything from the perspective of the GDPR when it comes to the MAC.

This is whole CityTraffic debacle is just yet another example of (Paul) Ohm's law:

No useful database can ever be perfectly anonymous, and as the utility of data increases, the privacy decreases.

By the way some friends (who are more into phones than me) told me that modern smartphones actually randomize their MAC addresses quite rigorously - I know that for Bluetooth they actually do, but I didn't know that they do for WiFi. MAC randomization exists as default by opt-out since 10 years from IPhone 5 upwards, and Android introduced MAC randomization in 2019. So maybe this problem will quickly solve itself, and the surveillance industrial-complex will have to resort to other - even creepier - methods like cameras and facial recognition to track us in cities.


On OPAQUE, SASL, Google, browser-extensions, privacy and security

2022-06-02

You are probably aware that I have been blessed with a grant by the NLNET foundation from the European Commission NGI0 programme to work on OPAQUE and it's integration into various tools. If not, now you have an explanation for all these posts lately. ;)

In my previous post I mentioned that for other protocols like XMPP, IMAP, POP3, etc OPAQUE would make a lot of sense. Each of these supports SASL as a generic authentication abstraction. Implementing an OPAQUE mechanism for SASL would immediately allow all these to benefit from the security guarantees provided by OPAQUE.

What's more, the SL in SASL stands for security layer, basically an encrypted channel. So with SASL connections could choose to use an OPAQUE negotiated key for secure channel construction instead of TLS. This is brilliant, two of OPAQUEs uses (authentication and secure channels) could be deployed in a lot of places with just one mech (and the SL) being implemented.

My last milestone to complete was HTTP server and client support for OPAQUE: apache, nginx, firefox and chrome. In my previous post I also mentioned that HTTP is not quite so cool with OPAQUE. I tried to weasel out of this, negotiating about this with my NLNET contacts, instead they mentioned another NLNET project which implements SASL for HTTP authentication (and even specifies an RFC draft) for apache and a java-based and a c++/qt-based variant for firefox.

Implementing a SASL mechanism achieves the widest possible coverage for OPAQUE, and thanks to the ARPA2 project it also provides cheap solutions for half of the HTTP milestone. Only nginx and chrome support for SASL would need to be implemented. So there it was, an OPAQUE SASL mechanism that worked even between firefox and apache2.

The SASL OPAQUE mechanism does not implement any secure-layer though, this is work for the future. This probably means finding or cooking up an libsodium based protocol to handle a secure channel. I looked at the existing SL protocols supported by SASL and I have to say I was underwhelmed, it felt like the 90ies called and wanted their DES back. But then I guess most protocols nowadays just use TLS underneath. Which is fine, but it defeats one benefit of using OPAQUE.

The next step was to also have nginx supported. Module development for nginx is not very well documented. I had to do a lot of trial-and-erroring how to attach variables to a request that survives an internal redirect. Variables are just what their names suggest, in the nginx config you can use those to add headers or to pass variables to fastcgi or uwsgi backends. The problem was to figure out how to keep these variable values while there is an internal redirect, which happens for example when someone tries to get "http://example.com/" which gets first authenticated with SASL, and then nginx redirects this to "http://example.com/index.html". In the end it all worked out.

Another thing I figured out (and this also applies to the apache module) is that due to SASL being stateful this whole authentication process only works, if all requests are handled by the same worker, the one in which the SASL state is stored. Using shared memory seems not a good idea, since the struct that needs to be shared contains a bunch of function pointers. If you have your webserver set up with seperate forked processes then these cannot share the SASL state and things will go wrong. But it seems that most browsers do keep-alive sessions, and the TCP connection is reused and thus the same worker is getting all the requests. However if your setup involves a load-balancer with seperate nginx servers as backends you will have trouble doing SASL auth, unless you terminate the authentication already at the load-balancer. Or use only SASL mechs which have no state, but those are much less secure than OPAQUE.

Another issue with OPAQUE authentication in HTTP is that if every request needs to be authenticated, then either the user types their username and password repeatedly, for most average webpages this means dozens of times. Or we need to cache the username/password somewhere, this sounds like a horrible opportunity to leak these credentials. This also means, that every request for authenticated resources involves 2 or 3 HTTP round-trips for each execution of OPAQUE. To solve this, there are at least two options, either implement channel binding in HTTP (which seems to exist for Microsoft IIS and is supported by all browsers) or to use the shared secret calculated by the OPAQUE execution in an ephemeral HOTP style protocol:

Authorize: SASL mech="HOTP",c2s="base64(username || hmac(kdf(shared_secret,dst), counter++))"

which would enable 0 round-trip re-authentication. This could be another SASL mechanism which could also be useful for other multi-round-trip SASL mechanisms. Specification and implementation of this is also considered future work.

So far this is all quite cool. I now can do OPAQUE authentication in a bunch of protocols even in HTTP on apache and nginx and firefox. But I had one last big puzzle piece missing, chrome and derivatives. Also I started to have this nagging thought that I should also support CLI tools like wget and curl, but they were out of scope for the NGI0 grant, but still, it would be nice.

For the last part - implementing a SASL auth browser extension for chrome -, I was quite optimistic^Wfoolish. I thought it will be easy to port the firefox add-on to chrome, since firefox adopted this google web-extension framework to make add-ons easy to port to all browsers. I hoped I just have to make a few small changes and everything will be dandy. I was so wrong!

Google is pushing for something they call "Manifest V3" (MV3) which is restricting the capabilities of web-extensions:

Manifest V3 is part of a shift in the philosophy behind how we approach end-user security and privacy.

So google is changing their approach end-user privacy? I'm surprised, that would mean a change in their business model.

Most importantly the blocking webrequest API will be restricted in MV3, for privacy reasons, surely no other reason:

Privacy: This requires excessive access to user data, because extensions need to read each network request made for the user.

Translated: google still gets access to every request made, but their supposed competition of other privacy invading add-ons (is that even a thing?) will not. It's a very unfortunate side-effect that those add-ons that actually protect users privacy, and cut into the profits of google also suffer. If this would be about protecting users privacy from misbehaving add-ons, then there is another solution. Google already control who is allowed in their app-store, so they could actually just remove all the extensions that violate the privacy of users, and those extensions that protect it by intercepting all requests could still be supported. This is purely a monopolistic profit oriented move, nothing else.

But it goes on:

The blocking version of the webRequest API is restricted to force-installed extensions in Manifest V3.

So there is a loophole: force-installed extensions, what are those?

As enterprises adopt Chrome Browser and Chrome OS, they often require added controls and configurations to meet their productivity and security needs. This can be achieved through the management of Chrome Enterprise policies. Chrome Enterprise policies give IT admins the power to configure Chrome for their organization or business. You can manage browsers on-premise for Windows or Mac/Linux, or manage browsers for all desktop platforms using Chrome Browser Cloud Management.

These policies are strictly intended to be used to configure instances of Google Chrome internal to your organization. Use of these policies outside of your organization (for example, in a publicly distributed program) is considered malware and will likely be labeled as malware by Google and anti-virus vendors. If Chrome detects that devices are configured with enterprise policies, it will show a message informing end users that their device or browser is being managed by the organization.

On Microsoft® Windows® instances, apps and extensions from outside the Chrome Web Store can only be forced installed if the instance is joined to a Microsoft® Active Directory® domain, running on Windows 10 Pro, or enrolled in Chrome Browser Cloud Management.

On macOS instances, apps and extensions from outside the Chrome Web Store can only be force installed if the instance is managed via MDM, or joined to a domain via MCX.

Employees have no privacy, companies want to inspect and control all communications and for that they still need blocking webrequests, and google is actively supporting this.

Anyway, Google being Google/Evil (or as I've recently taken to calling it, Google plus Evil), it was still an interesting loophole to explore. On linux an enterprise managed policy is under

/etc/{chrome|chromium}/policies/managed/managed_policies.json

and it must contain a stanza for a force-installed add-on and an optional appstore update.xml url. Luckily urls can also have a file: schema. So this is a valid managed_policies.json:

{
  "ExtensionSettings": {
    "nolflnhkeekhijfpdmhkehplikdkpjmh": {
      "installation_mode": "force_installed",
      "update_url": "file:///tmp/updates.xml"
    }
  }
}

The above example would try to automatically install the add-on which has an id: "nolflnhkeekhijfpdmhkehplikdkpjmh" by reading the update.xml which could look like this:

<?xml version='1.0' encoding='UTF-8'?>
<gupdate xmlns='http://www.google.com/update2/response' protocol='2.0'>
  <app appid='nolflnhkeekhijfpdmhkehplikdkpjmh'>
     <updatecheck codebase='file://tmp/http-sasl-plugin-chrome.crx' version='0.1' />
  </app>
</gupdate>

Look ma', only file: schema urls, no webserver needed. This is actually nifty. It allows to package webextensions in Linux distros, that auto-install and are able to use the webrequest API that is otherwise blocked in consumer-grade chrome browsers.

Of course this would only benefit Linux users, the cyber-proletariat of Microsoft or Apple users would have to clear more barriers, running their own AD or MDM/MCX whatever these latter are. So theoretically this is interesting, but in practice useless. Also I want to rise with my digital proletariat comrades not from them.

I also learned from the original firefox HTTP SASL extension developer Henri, that:

The main reason only Firefox is supported is because Firefox is the only browser allowing for the callback function parameter to return a Promise resolving to a BlockingResponse...

Since it seems that getting SASL HTTP auth into chrome is a very expensive project, I had to look for an alternate solution. And I also still had this nagging thought that I should also support curl and wget at least. I was talking to my good friend asciimoo, and over the last few years he was a couple of times suggesting to implement a privacy/security proxy for browsers. Such a proxy would not be dependent on the whims of monopolistic kraakens and neither on their has-been monopol-buster accomplices waging a war on their users. I like that idea of a privacy/security proxy, and I'm very much looking forward to deploy his proxy whenever he gets around to implement one. (btw there are such a things already: privoxy and a newer privaxy)

And thus I decided: (deep voice) this is the way. It is quite easy to write simple plugins for mitmproxy. In a few hours I had a working mitmproxy add-on which enabled any HTTP client to authenticate using SASL and OPAQUE. Not just curl and wget, but also chrome and its derivatives, w3m, and all the other strange web client beasts out there.

The nice thing about mitmproxy is that it quite elegantly solves the matter of installing a MitM TLS certificate authority cert in your browser so that also HTTPS connections can be authenticated with SASL. I don't believe this is a big deal, since the proxy has to run on the same host as your browser anyway because when authenticating it throws up a dialog asking for a user name and password. Thus the TLS connection is terminated in the same host, just not in the browser anymore.

Another cool thing about this HTTP SASL OPAQUE authenticating proxy is that this essentially eliminates phishing attacks as long as the password entry dialog can be clearly distinguished from browser content.

There is one drawback though, with browsers when they throw up a HTTP authentication window, they tend to switch to the tab which requested this. With an (unauthenticated?) proxy you have no clue which program or tab initiated a connection and wants you to authenticate the request.

I guess users deploying personal privacy proxies MitM proxies to reclaim their power over blocking javascript and the advertising maffia, while installing their own MitM CA cert in chrome was not what Google anticipated when they decided to lock down the webrequest API and instead provide this declarativeNetRequest travesty under the sarcastic excuse of changing their attitude towards their users privacy and security.

Anyway, albeit there are a few loose ends - like specifying and implementing the secure layer, and the ephemeral HOTP SASL mechanism -, this more or less concludes my work on libopaque and its support for various languages and protocols. There's gonna be some polish added here or there, i'm gonna track any changes in the RFC draft, but in general libopaque will switch to maintenance mode.

Shouts, greets and kudos to Henri, Rick, asciimoo and dnet for their input for this last milestone.

This project was funded through the NGI0 PET Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825310.


Running around with an OPAQUE hammer

2022-02-28

It seems I can't stop blogging about OPAQUE. In our edition today, I'm gonna look at some widely and less widely used protocols if they indeed are nails or only look like nails to me and my OPAQUE hammer...

SSH

The first on my list is SSH. For those of you who don't know, SSH has three "layers":

  1. The transport layer, this connects to the server, verifies TOFU the server key, and establishes a secure channel between the client and the server.
  2. The authentication protocol which runs over the transport layer and authenticates a user.
  3. The connection protocol that handles connections for the authenticated user.

As you can see these are building upon each other. And this limits the utility of OPAQUE, which would shine if it could take over the first and second layers, authenticating both parties and setting up an secure transport between them.

Since it is not possible to handle both the transport layer and the authentication protocol, it means OPAQUE can only be used in the latter to authenticate the user, with a password. The nice thing is, the password is never exposed to the server. However password leaks from SSH servers are not really a thing. A thing however is these annoying credential stuffing attacks, which are best mitigated by using asymmetric key-pairs, or high entropy passwords coming out of a password storage like SPHINX. So doing OPAQUE for authentication in SSH is possible, it is however questionable what the benefits of this are.

Is SSH a nail for my OPAQUE hammer? Nah, not really.

HTTP

Now the other obvious choice for applying OPAQUE is HTTP authentication. Just like with SSH the encryption of HTTP is handled in a layer below, by TLS. So setuping the secure transport channel and authentication of the user again is not gonna fly. With HTTP hower authentication using OPAQUE it is very much possible. The fact that the users password is never sent to or seen by the server is a good defense-in-depth in case the underlying TLS fails in some way. But there is one wrinkle, after the initial and explicit user authentication follow-up requests should also be "automatically" authenticated without asking for (or caching) the users password on the client for every request. A simple solution for this is to have a counter at both the server and the client that is HMACed using the shared session secret and then used as an authentication token. It avoids replay attacks. But as said it introduces state at both ends and concurrency can cause the counters to desynchronize. By the way this is similar to what the SCRAM HTTP authentication RfC proposes.

Another problem in general with HTTP authentication is that contemporary browsers make it easy to phish even HTTP authentication with fake password popups, and OPAQUE cannot protect against such attacks without magic. The only way currently to protect against phishing is to use a native extension that triggers an external process that cannot be reasonably imitated by a full-screen HTML phishing page. This is nice, since then the password never ever even touches the browser. After successful authentication a webextension could always add an authentication header with the "HMAC(counter,session_key)". Hopefully all concurrent requests are also sent out in the same order, otherwise the server and the web-extension can desynchronize, which is not good.

Is HTTP a nail for my (lib)OPAQUE hammer? If you squint real hard and ignore that you are driving the nail into the shifting sand of web browsers, maybe.

SHADOWSOCKS

Shadowsocks is a bit obscure, its a protocol to circumvent governmental firewalls. I have looked at shadowsocks briefly a few years back, and - I must say -, I think it is an awful protocol. Plugging in OPAQUE would definitely improve it significantly. The nice thing is that it can be used for both authentication and setting up the secure channel. I hope someone picks this up and improves the current state of Shadowsocks.

Is shadowsocks a nail? Definitely.

Others

Other obvious candidates are protocols like XMPP, IMAP, POP3, they are mostly all password based and OPAQUE support really "only" needs server support and client support. Of course for bloated mail clients like thunderbird this might be a bit more work to integrate, but basic mail retrieval agents like fdm or getmail this could be quite simple. As long as replacing TLS with OPAQUE+something else counts as simple. Similarly to HTTP these protcols all are TLS based, if that doesn't get replaced then OPAQUE will only be useful for user authentication.

These protocols are similar nail-like non-nails like HTTP, but at least the clients and servers are much easier when it comes to support different kind of authentication.

TLS

Well adding OPAQUE to TLS would be awesome, there was an attempt but it seems it has expired. Having OPAQUE in TLS would mean to be able to setup TLS connections without using the whole CA/Cert infrastructure. If passwords would come out of something like SPHINX that would be at least as secure as the current cert-based TLS.

Is TLS a nail for my OPAQUE hammer? Hell yeah it is!

Summary

However versatile OPAQUE is, existing protocols do not lend themselves easily to use OPAQUE most efficiently with them. Most useful would indeed be a TLS OPAQUE extension but that seems to be forgotten and lost. For a half-hearted solution OPAQUE can still be used to do user authentication, which has its benefits but adaptation greatly depends on servers and clients supporting this authentication mode. OPAQUE is certainly a nice hammer, too bad that most nails are already bent...


All OPAQUE and SPHINX related posts

2022-02-24

Over the last few weeks I have been writing a bunch of blog posts on my implementation of the SPHINX password storage SPHINX and the OPAQUE protocol. Below a list of all these and other related links:

SPHINX

The main post was followed by an attempt to explain how SPHINX security is better than the rest of existing solutions.

OPAQUE

The main post was followed by show-casing three use-cases for OPAQUE:

This project was funded through the NGI0 PET Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825310.


How to recover static secrets using OPAQUE

2022-02-24

We have already seen two use-cases for OPAQUE: authentication and securing a channel. A third less obvious - if you think of it as a (P)ake - use of OPAQUE is to store and retrieve static and probably sensitive data. In the previous example we always ignored the export-key, in this installment it will be the main instrument.

The export-key

The export-key is a key derived from your password during registration. This can then be used to encrypt additional data that is independent of the data needed for a run of the OPAQUE protocol.

Where to store the encrypted blob?

Export-key encrypted data can then be stored anywhere, but it makes most sense to store it on the server running OPAQUE. This allows a client to still remain free of any account specific state.

The blob could be of course stored on the client, but then if you are doing multi-device setups you have to sync it between all your devices.

Or you could store this data at another server, in which case your multi-device clients still need to sync at least the address pointing to this encrypted blob.

So the simplest choice in a multi-device setting is to store the blob next to your OPAQUE user record on the OPAQUE server.

What to store in the blob?

Well this is an excellent question, it could be some crapto wallet key, some other password, some long-term key-pair, user ids to some service or simply the anniversaries of/with your spouse. Maybe if you are a ransomware group you could store the unlock key in such a blob? (just kidding)

The example

You can find the complete source code to the following example in the git repo. You can also try out the example as a live demo. Unlike with the previous demo we do not provide a registration flow. There is one hardcoded OPAQUE record and encrypted message on the server. This also allowed us to get rid of the username entry in the "form". The correct password "password" of the hard-coded opaque blob will give you a short message, while anything else a failure. Let's dive into the example.

0. Starting a web worker and communication with it.

Since this example is running in the browser, we start a web worker thread so that the main thread of the page is not blocked while the OPAQUE protocol runs. This is how we start and dispatch between main thread and webworker:

index.js

(function () {
  "use strict";

  var button_fetch = document.getElementsByName("fetch")[0];

  function fetch(event) {
    postMessage("fetch");
  }
  button_fetch.addEventListener("click", fetch);

Here we just bind the button to trigger the web worker when clicked.

var pre = document.getElementsByTagName("pre")[0];

function postMessage(action) {
  var pw = document.getElementById("pw").value;
  // Send a message to index-worker.js.
  // https://developer.mozilla.org/en-US/docs/Web/API/Worker/postmessage
  pre.innerHTML = "<br>" + pre.innerHTML;
  worker.postMessage({ action: action, pw: pw });
}

This is our wrapper that logs any messages to the web worker to our makeshift "console".

// Use a web worker to prevent the main thread from blocking.
// https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers
var worker = new Worker("/index-worker.js");

This instantiates our web worker with the code doing all the OPAQUE back-and-forth.

  // Receive a message from index-worker.js.
  // https://developer.mozilla.org/en-US/docs/Web/API/Worker/onmessage
  var i = 0;
  worker.onmessage = function (e) {
    if (e.data.printErr)
      pre.innerHTML = i++ + ": " + e.data.printErr + "<br>" + pre.innerHTML;
    if (e.data.print)
      pre.innerHTML = i++ + ": " + e.data.print + "<br>" + pre.innerHTML;
  };
})();

And finally a callback for any messages coming from the web worker to be printed to our simple "console".

More initialization happens in the web worker itself, it initializes a Module object which is really just boilerplate generated from emscripten. The most important part there is the "root.onmessage" callback which dispatches the commands coming from the main thread. We omit this code here, as it is mostly generic boilerplate. The curious among you might have a look at it in the git repo.

1. The client initiates a credential request

When the fetch button on the HTML page is clicked, the main thread sends a request to the web worker thread, which initiates the OPAQUE protocol:

index-worker.js:

function requestCredentials(module, pw) {
  try {
    var request = module.createCredentialRequest({ pwdU: pw });
    var pub_base16 = module.uint8ArrayToHex(request.pub);
    var xhr = new XMLHttpRequest();
    xhr.open("POST", "/request-creds", true);
    xhr.onreadystatechange = function () {
      var response = onreadystatechange(module, xhr);
      if (response) recoverCredentials(module, response, request);
    };
    xhrSend("request=" + pub_base16, module, xhr);
  } catch (e) {
    module.printErr(e);
  }
}

Everything pretty straightforward, creating a request, serializing and sending it with a "XMLHttpRequest()" and chaining the final OPAQUE step in the "onreadystatechange" callback.

2. The server created a response and sends it back with the blob

In our demo the server is implemented in ruby, using the sinatra framework. In the example below the hardcoded OPAQUE user record and the hardcoded encrypted blob are omitted for brevity. The result is small and simple:

post '/request-creds' do
  request.body.rewind
  req = hex_to_bin(params['request'])
  rec = hex_to_bin("an opaque user record encoded as hex")
  blob = 'some encrypted blob encoded as hex'
  resp, _, _ = create_credential_response(req, rec,
                                          "demo user",
                                          "demo server",
                                          "rbopaque-v0.2.0-demo")
  content_type :json
  { response: bin_to_hex(resp), blob: blob }.to_json
end

The server side is really simple as you can see. The final step on the client is not much more exciting:

3. The client recovers its credentials and decrypts the blob

The response from the server is received through the "onreadystatechange" callback of the XMLHttpRequest, which calls this function:

index-worker.js:

function recoverCredentials(module, response, request) {
  const ids = { idS: "demo server", idU: "demo user" }
  const context = "rbopaque-v0.2.0-demo";
  try {
    var resp_base16 = response.response;
    var credentials = module.recoverCredentials({
      resp: module.hexToUint8Array(resp_base16),
      sec: request.sec,
      context: context,
      ids: ids,
    });
    const blob = module.hexToUint8Array(response.blob);
    module.print("Decoded blob: " + xor(credentials.export_key, blob));

  } catch (e) {
    module.printErr(e);
  }
}

Again nothing really surprising here, parameters get deserialized and "recoverCredentials()" is called. The only result we care about in this case is now the export-key, which in our case is used as a kind of one-time-pad to decrypt the message received in the encrypted blob. If the export-key is correct the message will decrypt in any other case gibberish will be the result.

Some Warnings

It is importantt to use real encryption with the export-key and the blob you want to protect, use something like "cryptosecretbox" from libsodium.js or similar. Do not use the simple one-time-pad mechanism used in this example, unless you really do understand what the implications of that are.

It is also important to note, that the live demo uses a debug version of libopaque which - not only dumps trace messages, but also - does not use any random source thus everything is always deterministic. Thus do not copy the libopaque.debug.js and deploy it in your own production setup, it is not secure! You have to build your own libopaque.js, or get one that is not compiled with "-DNORANDOM".

If you have the idea to implement a password manager storing passwords in the export-key protected blobs, that is a great idea! I had the same. There is only one problem, you cannot use OPAQUE authentication as a way to authorize change and deletion of export-key blobs, as this voids the offline-bruteforce resistance of OPAQUE for the server operator, which is something you really don't want to do (we tried, don't be like us. learn from our faults!).

Summary

In this post we have seen how to use the OPAQUE export-key to protect some at-rest blob. The ruby server code shows clearly how simple and how little is needed to implement this. The javascript client implementation is a bit more work, but most of it is either boilerplate, or based on functionality that most javascript frameworks provide already. It really is a bit unfair to compare something written with sinatra to something vanilla js.

This post concludes the series on the generic use of OPAQUE, we hope you found this useful and will make good use of libopaque in your own system.


How to use OPAQUE for setting up a secure channel

2022-02-22

I heard you need to write a tool that sets up a secure channel between two peers, but you cannot afford the luxury of using certificates for doing so. Maybe you want to develop a less braindead protocol than shadowsocks? All you have is a password? How about combining OPAQUE with a double-ratchet? You heard about OPAQUE, but are afraid to ask how to do this? Fear not, the following bits are here to enlighten you.

A warning

First a warning, since the security of the channel depends on the strength of your password it is essential that password is of high entropy and that your server employs some kind of rate-limiting, and possibly a whole range of other limitations and defense-in-depth mitigations. If possible try to use certificates, preferably in HW tokens though. Of course the best solution would be to use a strong password storage like SPHINX.

An example

The source code to this example can be found in the demos/chan-c-go directory of the libopaque sources. The example shows a simple client written in C connecting to a server written in Go. The client takes the password and the ip address of the server as command-line parameters. The server has one user record hard-coded that opens with the "super secure" password: "password". After completing an OPAQUE run, the client and server tries to exchange a message which is protected by the shared session key derived via OPAQUE and used with cryptosecretbox mechanism from libsodium. This will only succeed if the user provided the correct password at the command-line of the client.

Let's have a look at the relevant steps the client and server execute.

First step: client initiates

Initiating a OPAQUE session always starts with the client and the user providing the password to it. Well wrap the whole OPAQUE flow in a function that takes a password as input, and when everything is done returns the shared session secret:

#include <stdint.h>
#include "opaque.h"

int get_session_secret(const int sock,
                       const uint8_t *pwdU,
                       const size_t pwdU_len,
                       uint8_t sk[OPAQUE_SHARED_SECRETBYTES]) {

  // let's prepare to make create a credential request, we need some
  // data to store it in:
  uint8_t request[OPAQUE_USER_SESSION_PUBLIC_LEN];
  uint8_t ctx[OPAQUE_USER_SESSION_SECRET_LEN+pwdU_len];
  // ctx is sensitive data we should protect it! with c you can
  // actually protect sensitive data much better than with other
  // languages, sodium wraps this up nicely and portably:
  if(-1==sodium_mlock(ctx,sizeof ctx)) {
    fprintf(stderr,"Failed to protect sensitive context\n");
    return -1;
  }

  // let's create the credential request
  if(0!=opaque_CreateCredentialRequest(pwdU, pwdU_len, ctx, request)) {
    fprintf(stderr,"Failed to create credential request\n");
    return -1;
  }

  // send off the request to the server
  if(sizeof request != write(sock, request, sizeof request)) {
    //fprintf(stderr,);
    perror("failed to send the request\n");
    return -1;
  }

....

Second step: the server prepares a response

The server must read a user id and a request from the connection. Using the user id it can load the according user record, and with it can create a credential response.

import libopaque

...
  // we need to store the request
  request := make([]byte, libopaque.OPAQUE_USER_SESSION_PUBLIC_LEN)
  // read the request from the connection
  b, err := c.Read(request)
  if err != nil || b != libopaque.OPAQUE_USER_SESSION_PUBLIC_LEN {
    fmt.Println(b)
    panic(err)
  }

  // create a response based on the request, the hard-coded user
  // record, the ids and the context.
  resp, sk, _, err := libopaque.CreateCredResp(request, rec, ids, context)
  if err != nil {
    panic(err)
  }

  // send the response over to the client
  b, err = c.Write(resp)
  if err != nil || b != libopaque.OPAQUE_SERVER_SESSION_LEN {
    fmt.Println(b)
    panic(err)
  }
...

And this concludes the servers OPAQUE part, there is no further step necessary. The session secret should be used to setup an encrypted channel, like using the session secret as a key in a sodium/secretbox, or perhaps indeed feeding it to a double ratchet. Authentication will be implicit, if the authentication of any packages between the server and the client fails, then there is something afoul and you should abort your connection.

Last step: the client finishes the session setup

The last step is quite simple, the client reads the servers response, and recovers its credentials. Most notable are the two NULL parameters, one for the authentication token, which we don't need since we are doing implicit user authentication. And the other NULL is for the exportkey which we also do not need in our case.

  ...
  // we need to store the servers response
  uint8_t response[OPAQUE_SERVER_SESSION_LEN];
  // receive a response from the server
  if(sizeof response != read(sock, response, sizeof response )) {
    perror("failed to read the response\n");
    return -1;
  }

  // we need to supply the same context and user ids to the final step
  // as have been used by the server
  const uint8_t context[]="context";
  const Opaque_Ids ids={4,(uint8_t*) "user",6,(uint8_t*)"server"};
  // we recover the shared session key, and we set the authorization
  // token and the export_key parameters to NULL since we do not care
  // about them in this demo.
  if(0!=opaque_RecoverCredentials(response, ctx, context,
                                  strlen((char*)context), &ids, sk,
                                  NULL, NULL)) {
    fprintf(stderr,"Failed to recovercredential\n");
    return 1;
  }
  // yay everything went fine.
  return 0;
}

The result of this function is in the parameter sk, which should be fed into a counter-part of whatever the server is doing - be it a secretbox, or a double ratchet.

Summary

There is strong initiatives trying to get rid of passwords, and in some cases it makes sense. However there will be many use-cases where this cannot work. If the usage of certificates or hardware tokens is not possible, passwords if strong enough can still provide adequate protection even for protected communication channels. The nice thing about passwords is, that you do not need anything to use them, nothing to carry, nothing to sync, no state, no worries (ok, you still need a client though). And in this example and the accompanying demo we showed how simple it is to use OPAQUE with a password to create a forward secure communication channel between a client and a server.

Don't forget to come back for the next episode where we'll have a look how to store securely some sensitive data with OPAQUE.


opaque demo

2022-02-17

I just published a live demo of the authentication using OPAQUE discussed in my previous post.

The backend is a simple python flask app. It has one nifty trick, the server is stateless between steps of the protocol. Both the during first step of registration and authentication the server creates a sensitive context that it needs to protect until the second server step. However imagine your setup sitting behind a load-balancer, then all the workers, would have to somehow synchronize this state. Instead they all share one symmetric encryption key, with which the server state is encrypted and sent to the clients, who have to send it back for the final server step.

The javascript frontend is a bit more complex, in order to not block the main thread, it uses a webworker. The initial javascript file index.js is only a wrapper that handles messaging between the page and the worker. The magic happens in the index-worker.js. It's pretty straight-forward, even if there is quite some boilerplate at the end of it. But that boilerplate is necessary to load the webasm module and communicate with it and the wrapper index.js.

Want to learn more about libopaque, or how to improve the safety of your application in a simple way - you can read more about it on the ]libopaque project page.

This project was funded through the NGI0 PET Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825310.


pocorgtfo 21 12 apocrypha

2022-02-17

In POC||GTFO 21:12 I revealed my pilgrimage to reveal a devilish backdoor forced by the NSA on the PX1000. Not all of my outputs have been included in these blessed pages. Below notes on the margins of these.

For your convenience I also prepared a git repo with all the code that was used in finally defeating this infernal construction.

I would also like to mention that Act I & II are also available on your favorite streaming platform as a talk:

Epilogue

A few days after sharing this solution with the fine cryptomuseum people, I got a hint. The taps for the LSFRs are much less complicated than thought, and they're hiding in plain sight. Remember this line from the loop updating the 16 byte LFSR block (page 62, first code example, line 5):

acc ^= lfsr[i] & lookupTable[(round_counter+i)%16];

Well this lookupTable is the taps of the four LFSRs, only the taps are bit-sliced just like the LFSRs themselves. The only magic was that the roundcounter starts from 0x1f, thus the 15th byte was the first to process for extracting the taps. The following function from the lfsr32.py script takes care of extracting the nth LFSR taps:

taps = (
   0x06, 0x0B, 0x0A, 0x78, 0x0C, 0xE0, 0x29, 0x7B,
   0xCF, 0xC3, 0x4B, 0x2B, 0xCC, 0x82, 0x60, 0x80)

def extract(taps, i):
    # left to right
    tr = [''.join(str((taps[j] >> b) & 1) for j in range(16)) for b in range(8)]
    # horizontal bottoms-up lines appended
    return (tr[(i*2)+1]+tr[(i*2)])

for i in range(4):
  print(extract(taps, i))

Running the above code, results in these left-zero-padded results:

32 bit: 11100001111101000100001111110000  e1f443f0
31 bit: 01111011101110001000100010001000  7bb88888
29 bit: 00010111000100100001000100000000  17121100
27 bit: 00000100110011010001010111101010  04cd15ea

Each of these tap configurations create LFSRs that are of maximum cycle size. Implementations of these are included in the lfsr.c file.

Final thoughts

I do not know if the NSA did have a SAT solver like z3 back in 1983, but the fact that I can recover a key 40 years later within 4 seconds on a laptop CPU in a single thread - while I am far from being able to do so for the same if DES would be used -, lets me conclude that the PX1000cr algorithm is indeed a confirmed backdoor.

As noted above though, the equation system that needs to be solved is quite huge (about 20 something megabytes when printed out as ASCII text), it is safe to assume that the NSA needed much more efficient ways in the 80ies to break this algorithm.

A very simple and cheap attack which also amortizes the costs of the algebraic attack is based on the fact that the first and last character and the top bits in between leak the keystream. This makes it almost trivial to detect key reuse. And this being a stream cipher key reuse is catastrophic in cryptographic sense, a good example of this is the Venona project and Julius and Ethel Rosenberg's fate. We can say that the NSA has for decades worked on recovering plaintext from two-timepads, but in the Venona this was more difficult as they didn't know which two messages were sharing the same key. With the PX1k this is much more trivial. For Venona it's a solid bet that the NSA built a lot of HW that recovers plaintext from two plaintexts xor-ed together. This means they only need to use the more expensive algebraic or other attacks for cryptograms with unique keys. I believe this is an important insight, there's not just one backdoor here, but multiple ones that work together.

Also notable is that this backdoor is definitely predating the NOBUS - nobody but us - policy. I heard that other intelligence services did notice the change of algorithm in the PX1000 and also had done their own analysis.


Why and how to use OPAQUE for user authentication

2022-02-17

Storing clear text passwords in your user database is very stupid.

Traditionally we store hashed passwords of users. When a user logs in, then the user sends their cleartext password to the server (hopefully protected by TLS), the server hashes the password and compares it to to the stored hash, if it matches, all is good. Lets call this symmetric authentication, where both the client and the server knows the cleartext password.

A simple threat model

With authentication you have the channel over which the token is transmitted:

  • here you don't want a passive adversary learn your cleartext password (sniffing),
  • or an active adversary being able to replay your password as in pass-the-hash attacks (replays).

You say that but its all protected by TLS, maybe, but maybe your TLS terminates at cloudflare or another content delivery network (CDN), or the TLS configuration fails for some reason (e.g. certificate verify = false, or you push your private key to github), or your certificate store has certificate authorities that are controlled by some adversary.

On the server itself:

  • you want to be safe against SNAFUs like when Twitter was logging clear text passwords, or heartbleed is leaking them. Also you don't want active attackers on the server to be able to read the plaintext passwords (direct password leaks).
  • And finally you want to avoid in case an attacker leaks your user database that there are already pre-computations available that simplify the recovery of passwords from hashes into cheap lookups, so you use salts. (pre-computation)

You want to protect those passwords of your user so that in case your password database leaks, you want to make it as difficult as possible to anyone to recover passwords, in case those passwords are reused at other sites and enable attackers to abuse these accounts.

So you really don't want your cleartext password to be sent to the server, ever. And it's possible, for example CRAM - a challenge-response protocol - eliminates the sniffing vector. The socialist millionaire protocol (SMP) eliminates both sniffing and replay issues. But they still need the cleartext password on the server. So, that leads us to…

Asymmetric authentication

If there is symmetric authentication, there must be also asymmetric authentication, where only one party knows the password. And the nice thing about asymmetric auth is that it immediately eliminates sniffing and direct password leaks problems.

Let's hypothesize a simple hash-based protocol where password is never sent in cleartext to the server. If instead of directly sending cleartext password,

  1. since we're hashing on the client, we need to know the salt with which the server has the password hashed with during registration, so the client first asks for the global or user specific salt.
  2. the server sends the salt via tls, then
  3. client calculates hash(per-user-salt, pwd) sends to server, and
  4. server hashes this again (otherwise the once hashed password could be used by anyone having access to the user database) and compares that to the stored value that has also been hashed twice.

In case of SNAFU (TLS fail, MitM, CDN, attacker on server, twitter fumbling, etc) the salt leaks, and possibly a unique hash, that needs to be bruteforced to recover the password. However there are a few problems:

  1. the salt must be requested from the server, adding 2 more steps to the auth protocol. Caching the salt TOFU-style in a cookie solves this.
  2. salts mustn't leak if a user exists or not, that means for non-existing users, the same salt must be returned for the same non-existing username. This can be solved by calculating the salt like this:

    salt:=hmac(username, serverkey)
    
  3. global salts enable an attacker to acquire the salt and pre-calculate.
  4. if a username is known, an attacker can query the salt and prepare a pre-calculation.
  5. the hashed password can still be used in a pass-the-hash style replay attack by anyone stealing it.
  6. an active attacker could respond with a salt, for which they have a pre-computation, and then recover the client's password from the received hash.

Of course this hypothetical protocol is only a gedankenspiel, an example of why we should not invent our own crypto, and that every solution also bears new problems.

What we really want is an asymmetric Password-Authenticated Key Exchange (aPAKE), of which the most prominent one is SRP. Coincidentally the first versions of SRP also were fraught with serious issues.

With OPAQUE you never send your password to the server, eliminating sniffing and replay attacks, and an attacker has no pre-computation opportunity. Since using OPRF shields the "salt" and the password from an attacker OPAQUE also eliminate direct password leaks. Beyond this OPAQUE has proofs in a very strong model. Another benefit of using OPAQUE is that the memory-hard key-stretching function can run on the client, and thus reduces the attack surface for computational denial-of-service (DoS) vectors against the server. Also notable is that OPAQUE can run over an insecure communication medium, there is no need for TLS or anything else. Another nice feature of OPAQUE is, that the server can return a fake record to mitigate user enumeration attacks, an attacker will not be able to decide if the user exists or not.

How to Authenticate with OPAQUE?

So you want to eliminate the burden of cleartext passwords on your service, and you are willing to go from one message to 3 messages in the authentication dance. Assuming a user is already registered at your service, which might have been done by running the following binary as supplied by libopaque and storing the resulting record at the server:

echo -n password | ./opaque init user server >record 3>/dev/null

The following examples are using the javascript and python wrappers provided by libopaque.

1. The client initiates a credential request

For example using javascript in a web browser:

client.js:

const opaque = require("../dist/libopaque.debug.js");

(async () => {
  await opaque.ready;

  const pwdU = "password";
  let { ctx, pub } = opaque.createCredentialRequest({ pwdU });

  const userid = "user@example.com";
  send_to_server(request, userid); // you need to implement this fn
  ...

The client sends "request" over to the server, and holds onto "ctx" as securely as possible.

2. The server handles the "request" from the client

server.py:

from opaque import CreateCredentialResponse,
                    UserAuth,
                    Ids)

...

# server reads the request from the client
request, userid = get_request() # you need to implement get_request()

# load the record
record = load_record(userid) # you need to implement load_record()

# wrap the IDs into an opaque.Ids struct:
ids=Ids(userid, "servername")

# create a context string
context = "pyopaque-v0.2.0-demo"

# server responds to credential request
response, _, authU = CreateCredentialResponse(request, record, ids, context)

send_to_client(response) # you need to implement send_to_client()

The request is probably read from the network.

The user record that has been created during user registration is loaded probably from disk or a database based on the user id.

By default peers in OPAQUE are identified by their long-term public keys, in case you want to use something else as identifiers, you need to specify them when creating the credential response, in our example we use the userid as provided by the client and "servername".

Also important is to provide some context-string to prevent cross-protocol or downgrade attacks, hence we provide context string.

When creating a credential response the output does not need any kind of extra protection, it is already encrypted and authenticated.

Another output of this functions is a shared key, which is not needed in this case, where we are using OPAQUE only to authenticate.

However the third output, the users expected authentication token is needed by the server in the last step of this protocol.

3. The client recovers its credentials

client.js:

  ...
  const response = read_response(); // you need to implement this function

  const ids = { idU: userid, idS: "servername" };
  const context = "pyopaque-v0.2.0-demo";

  const {sk, authU, export_key,} = opaque.recoverCredentials({resp, ctx, context, ids});

  send_to_server(authU); // you need to implement this fn.
}

The client receives the servers response and uses its private context "ctx" from the first step, to recover its credentials.

The recovery needs the same ids and context string as the server was using.

The result of the recover credentials are:

  • a shared key, which is not needed in case we use OPAQUE only to authenticate,
  • the authentication token of the user, and
  • an export key which is also not needed for authentication-only use of OPAQUE.

Finally client sends the user's authentication token to the server to explicitly authenticate itself to the server. This concludes OPAQUE for the client.

4. The server authenticates the client

server.py:

authU0 = receive_authU() # you need to implement this function
# server authenticates user
if not UserAuth(authU0, authU): raise AuthenticationFailure

The server takes the user authentication token it generated in the second step and compares it to the token it received from the client, if it matches the user is authenticated.

Conclusion

With OPAQUE, you never send your password to the server, so there is nothing that can be sniffed, or replayed, nor can your password leak in any way. The "salt" is also never available to eavesdroppers, which makes pre-computation impossible. Furthermore memory-hard password hashing functions are running on the client, which makes computational denial of service attacks against servers less of a problem. And there is even support to mitigate user enumeration attacks. The only two major problems are phishing attacks, if you get tricked to reveal your password, then it's game over. And offline bruteforce attacks in case a user database leaks, but that should be also more difficult due to the used memory-hard password hashing function used.

All in all OPAQUE is a very efficent protocol with very strong security guarantees which, thanks to libopaque is easy to integrate into you application.

This project was funded through the NGI0 PET Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825310.


opaque

2022-02-17

OPAQUE is a cool protocol that combines a Oblivious Pseudo-Random Function (OPRF) and an Authenticated Key-Exchange (AKE) into a protocol where a user holding nothing but a password and a server holding some information protected by the password can establish a shared secret. Because the password is used through an OPRF the server never learns anything about the password or the result of the PRF. A bit longer elaboration on this is written by Matt Green.

Over the last years I have been implementing this protocol. First it was part of libsphinx. Later it was split out into its own libopaque. Later the IRTF CFRG started on a draft specification, which is in constant flux and quite some effort to keep conforming to. This means also that libopaque itself is also unstable as long as there is no final version of the IRTF CFRG specification.

OPAQUE is a strongly recommended replacement for the SRP protocol.

What is it good for?

The shared secret which comes out of the AKE can be used to authenticate the client and the server to each other (the server is authenticated automatically), authenticating the client takes one extra message.

The shared secret can also be used to setup an encrypted channel between the two peers, for example it might be used as a root-key for a double-ratchet from the signal protocol. In this case there is no need for an explicit authentication of the user since the channel itself is implicitly authenticating the user.

Furthermore OPAQUE can be used as a kind of storage mechanism of sensitive data which is stored at the server and can be recovered by a password. For example this can be used to store another key, password or even a cybercoin wallet key.

Another nice feature of OPAQUE is that there is no tokens/messages being sent between the parties that can be replayed by an attacker between the client and the server.

Things OPAQUE doesn't solve

In case a server uses OPAQUE to set up sessions with users, and the user database leaks, an attacker can run offline brute-force attacks against this database to recover any user passwords, and possibly other data protected by this. OPAQUE does make this harder though, the IRTF CFRG draft specifies scrypt as a memory-hard stretching function. Unfortunately the draft does not specify argon2i as such, since there is no I(ER)RTF specification for this. In libopaque argon2i is used though, which should make an attackers life quite more difficult.

Another thing that OPAQUE doesn't solve is phishing attacks in a browser, as soon as someone enters their password into a phishing page in a browser, all is lost. But this is really a problem with browsers which are broken-by-design (in so many ways).

There is a whole lot of other things that OPAQUE doesn't solve. But still it is much better than most other ways to authenticate users with their passwords stored or hashed.

A cool attack

Being one of the first and more mature implementations made it a target for a fabulous attack called Partitioning Oracle Attacks. The attack is based on the fact that some protocols that derive keys from passwords use these keys with algebraic authentication mechanisms, such as Poly1305 or the GCM mode of AES. Due to an algebraic attack it is possible to construct messages that authenticate under a number of password candidates, and thus with just one message and the response to it is possible to test whether one out of many passwords actually is a valid one. And thus instead of sending one message per guessed password, an attacker can send one message and say test 5000 passwords. As far as I understood each tested password increases the message size.

Regarding that paper, there is one quote which needs correction though, the authors write:

By default, libsphinx accepts a C∗ only up to length 4 MB due to a memory management bug — it crashes for larger ciphertexts due to a statically allocated buffer. Once fixed, it accepts cipher-texts of up to 2 GB. This would enable splitting ciphertexts with degree up to k = 1.25×108.

As the saying goes: "this is not a bug, this is a feature", an intentional feature. libopaque implements everything with variables allocated on the stack, there is no usage of the heap. Normally your Unix limits the size of the stack, you can check this by running

ulimit -s

Which in my case reports 8MB, and if something writes more than space is available on the stack, then the application crashes, it's an out of memory condition and guard pages raise a segfault. The authors wrote: "once fixed", meaning they allocated a huge 2GB buffer using malloc() on the heap (running ulimit -s 2147483648 would've also done it). This is kind of limitation applies to all languages, this even happens the same way in a rust implementation.

Anyway, even before publication of that awesome paper I added more defense-in-depth measures like also maximizing the size of the "C*" buffer to only a couple kilobytes. So libsphinx despite being theoretically vulnerable had a bunch of defensive-coding measures deployed that made the original code quite resistant to this attack. The root-cause of the attack has been fixed independently before the publication of the attack paper.

Bindings

libopaque comes with a lot of native bindings: erlang, go, java, js, lua, php7, python and ruby are part of the core library, and my dear friend dnet maintains a rust binding opaqueoxide. Also if you like to do things in shell scripting - using socat or similar tools - there is standalone binaries for each step of the protocol.

Thanks

I'm grateful to creemama who supplied the initial implementation of the js bindings.

This project was funded through the NGI0 PET Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825310.


oprf

2022-02-17

Oblivious Pseudo-Random Functions

In my previous SPHINX post I hand-waved away how the SPHINX oracle learns neither the input nor the output password. The mechanism behind this is a less well known cryptographic primitive, that is very nifty. Let's start with what is a pseudo-random function (PRF)? According to Boneh - Shoup Graduate Course in Applied Crypto:

A pseudo-random function (PRF) F is a deterministic algorithm that has two inputs: a key k and an input data block x; its output y := F(k, x) is called an output data block. Intuitively, our notion of security for a pseudo-random function says that for a randomly chosen key k, the function F(k, ·) should — for all practical purposes — “look like” a random function...

... that maps the input blocks to the output blocks in a way that no attacker can distinguish if the output blocks are computed using a PRF or if are the output of a random oracle that always returns the same output for the same input.

Some block ciphers can be considered PRFs, for example:

output_block := AES(key, input_block)

This is it, this is not about encryption, or decryption, it's simply the deterministic mapping of two-blocks. The PRF doesn't have to be reversible, there is no encrypt paired with a decrypt.

Another example of a PRF could be the scalar multiplication of a point X on an elliptic curve:

Y := X * k

In this case the point X is the input block, the scalar k is the key, and Y is the output block.

Oblivious PRFs are PRFs where two parties are computing a PRF in a way, where one party knows the input block, and the other one knows the key, and neither learns anything about the other, while one of them learns the output block. Constructing an OPRF on elliptic curves is only possible if the curve is a prime-order group. For example the basic ed25519 does not qualify, but a slight mutation of it - ristretto255 - qualifies. In which case an OPRF can be constructed as follows:

Alice generates a random blinding factor r, which is a scalar, and multiplies her input block with this r:

alpha := input_block * r

She sends alpha over to Bob, who has the key k and wants to keep it secret, so Bob calculates beta from alpha and the key:

beta := alpha * k

And then sends back beta to Alice, who then can then unblind the result, by multiplying beta with 1/r:

output_block := beta * 1/r

if we expand the last calculation we can see how r is eliminated by 1/r:

output_block := input_block * r * k * 1/r

and by cancelling out r and 1/r we get:

output_block := input_block * k

The whole magic is in the application of the blinding factor r, because of that Bob doesn't learn anything about Alice's input nor output block. And for some fundamental reason related to elliptic curves in general Alice does not learn anything about Bobs key. You could say that the application of the blinding factor r is a kind of encryption of both the input and the output block.

And this is a simplified explanation of how an OPRF works. And it is also exactly how it is used in the SPHINX protocol to "store" your passwords, in SPHINX the input block is the hash of your input password mapped to a point on the ristretto255 curve. And the output block is further hashed with your input password again and the host name and username of the account the output password belongs to, and then a memory hard password hash function is applied so that any kind of brute-force attack is slowed down and requires significant amount of memory. The final result without the OPRF steps is then the following:

rwd := blake2(pwd||ristretto255_fromhash(blake2(pwd))*k)

salt := blake2(hostname||username, client_key)

binary_password := argon2i(rwd, salt)

Other OPRFs

The above OPRF has been extended into VOPRFs where the V stands for Verifiable. A VOPRF ensures clients can verify that the server used a specific private key during the execution of the protocol. A VOPRF can also be partially-oblivious, called a POPRF. A POPRF allows clients and servers to provide public input to the PRF computation. For me the most exciting extension of the OPRF concept is a threshold OPRF, where a user interacts with t-out-of-n servers to calculate the OPRF.

Conclusion

OPRFs are a nifty crypto primitive that is useful in many different scenarios, besides SPHINX, I also have an implementation of the OPAQUE protocol which also has an OPRF as it main enabling cryptographic primitive. More on OPAQUE soon, here.

This project was funded through the NGI0 PET Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825310.


sphinx

2022-02-17

Announcing the SPHINX

TL;DR if you are using keepass, pass, bitwarden, or similar password managers, you might want to switch to sphinx to handle your passwords (certain caveats apply see below).

SPHINX is a simple self-hosted online password "storage" protocol with some very strong security guarantees, which other password storage solutions do no possess. One of the authors of the protocol jokingly said that SPHINX could even be operated by the NSA and it would be safe (in the context of the NSA being a global adversary of course).

How does it work?

You ask the SPHINX to return the password for a given host and username pair and you supply your input password, the SPHINX oracle contributes some seed to this, without the SPHINX learning anything about your input nor your output password. In very simplified terms1 your output password is something like:

output_password = hash(input_password)*sphinx_seed

(it is a bit more complicated than that, for details see the next blogpost, the original academic paper, and our whitepaper about our extensions)

Since SPHINX is an online password storage, you do not have the problems caused by trying to sync a database across different devices.

If you like audio-visual content, here is one of the authors of the protocol explaining SPHINX in the first 20 minutes of this video.

And here is me, showing an early implementation of SPHINX in the first half hour of this video.

How is SPHINX different?

The storage server does not store any passwords, and thus an attacker accessing the storage does not learn anything about the passwords.

In a conventional password manager your input password unlocks a kind of database of passwords and reads out the password decrypted by your input password. Thus your input password is the key to all of your kingdom. Unlike with traditional password managers, you can use different input passwords with SPHINX. This means you can have some simple passwords you don't care about too much, and have a super sophisticated that protects the passwords you care about most.

The passwords are unique and of high entropy - like they would be coming out of a password generator -, making any kind of guessing attack computationally impossible. If a website gets hacked and the password database is leaked you do not have to fear a bit even if mediocre password hashing was protecting your output password in the leaked database.

If one of your output passwords gets leaked, an attacker can only run an online bruteforce attack to recover your input password. This means the attacker needs to have access to the SPHINX oracle to recover your input password from the output password.

Unlike traditional password managers, there is very limited support to store predetermined passwords. It is not recommended, but if you insist you can store up to about 40 character long passwords with SPHINX. If you really need to store longer or more sophisticated passwords and keys, then it is recommended to use some dedicated tool like age, pass, passage to store these while using a SPHINX output password to unlock them.

What are the limits of SPHINX?

SPHINX cannot do shared passwords in a team. It would compromise the security guarantee of the protocol.

SPHINX cannot store arbitrary secrets. Although short secrets are supported, doing so also weakens the security properties of the protocol - but only to the level of conventional password.

SPHINX is not a commercially supported polished end user product, currently we have a command line client in Python, an Android client and browser extensions for Firefox and Chrome derivatives. You are welcome to contribute further frontends, e.g. for Apple devices.

There are no backups to users of their password storage, since the server doesn't even know which records belong to the same user. Operators of servers must ensure full backups for restoring in case of data corruption of the records. Users either backup themselves their output passwords (not recommended) or they have to rely on the services they use having a password reset procedure.

Where's the code?

Glad you asked! I feel like you are the perfect match for hosting a SPHINX oracle for your community with this attitude!

The latest version of the production server is being dogfooded at https://sphinx.ctrlc.hu/ - in case you don't mind that this is also where we experiment with new versions and (sometimes even backward-incompatible) updates, then by all means use it and report issues back to us.

Thanks

I would like to express thanks, kudos and much love to hugo, dnet, d3v, rolf, asciimoo, jonathan, michiel, fabs, stefan and all those that I will be ashamed by later by forgetting to mention them here...

Oh - before I forget, thanks to all you future users hosting their own SPHINX oracles for their communities.

This project was funded through the NGI0 PET Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825310.

1. hand-waving away the details why the oracle does not learn anything about your input nor your output, but if you are interested look up what an Oblivious Pseudo-Random Function is.


pitchfork

2017-03-21

After years of training journalists and NGOs communication and operational security, after years of conducting research into the tools and protocols used, it took some more years developing a reasonable answer to most of the issues encountered during all this time.

In todays world of commercially available government malware you don't want to store your encryption keys on your easily infected computer. You want them stored on something that you could even take into a sauna or a hot-tub - maintaining continuous physical contact.

So people who care about such things use external smartcard-based crypto devices like Ubikey Neos or Nitrokeys (formerly Cryptosticks). The problems with these devices is that you have to enter PIN codes on your computer that you shouldn't trust, that they are either designed for centralized use in organizations, or they are based mostly on PGP.

Acquiring, verifying, trusting and using the correct PGP keys from your peers is also a delicate operational security dance where lots of steps can be easy to mess up. A proper device would be able to directly exchange keys with other similar devices, so that it becomes easier with much less opportunities to err. Another shortcoming of PGP is it's use of aging cryptographic primitives. An adequate device would deploy post-quantum algorithms with protocols that allow forward secrecy, peer anonymity, and other modern concepts missing from PGP.

A well-designed device must also come with a proper threat model. A threat model explains the defensive capabilities and the limits of any security device by making assumptions about the attacker. So that the user can understand how and against what a device protects, what is based on assumptions and what on proofs.

One of Snowdens revelations provided evidence for interdiction attacks, rerouting packages for backdooring hardware while it is shipping to the customer. An ideal device could be bought and assembled locally without leaving a window of opportunity for interdiction attacks. For the most paranoid (or their trusted friends) it should be possible to buy all parts in a local store and assemble such a hardware device at a local workshop. Having all designs and software available for free makes it easy to customize and extend such a device.

You also want a device that doesn't draw attention, you want something like a phone, a smartwatch or a USB stick.

You want a PITCHFORK.

I'm happy to introduce Project: PITCHFORK and to announce the publicly availability of all related sources.

Project:PITCHFORK is an attempt to produce tools to improve and research operational security of individuals and groups.

The PITCHFORK is a small USB device which is a cryptographic swiss army-knife. This is the original concept from 2013 framed as an NSA leak (which confused quite a few friends of mine back then):

nsa-style

PITCHFORK.pdf

Here's the official site:

https://pitchfork.ist, the wiki and most importantly all the related git repos.

If you're into embedded or crypto development the PITCHFORK is a serious device that contains a lot of fun. Cool people from the TU/Nijmegen regularly dump out nice crypto code that is optimized for the Cortex-M series. The PITCHFORK serves three goals, to protect our keys, and to provide a platform for building and breaking crypto on embedded platforms.

A bit of history

Development started in 2013, with the experimental PGP replacement PBP, by trying to run curve25519 operations on a r0ket, and the now quite popular libsodium wrapper pysodium. In late 2013 I got my development board, an Open207 from waveshare and had the first USB storage controller firmware and initial PITCHFORK firmware ready.

In May 2014 I started pyrsp, a tool to make development easier by allowing python scripting directly to control the cpu over Serial Wire Debug (SWD) protocol, which is similar to JTAG but uses less wires. A talk about pyrsp became a hit even on hackaday. A bit later I figured out how easy it is to look for PGP encrypted messages, a variant of that even made it into the file(1) magic signatures. In parallel to pyrsp I started to design the board, inspirations were the original r0ket and the bitcoin trezor project.

The first boards arrived early 2015, but until early summer work was suspended, the first bugs were identified and more during our camp++ where I also gave a talk on the progress (or lack thereof).

Work was suspended until beginning 2016, when a 2nd batch of boards was ordered with all bugs fixed - and a few new ones. Lots of work was done on the HW and the firmware in the first half of 2016, also a Nokia 3310 version was designed and ordered. At the camp++ in 2016 I gave another talk. And we also started to work on the Reflowmaster2000plus Deluxe Pro - a reflow-oven - so that you can indeed bake your own PITCHFORKS at home in your toaster. A first closed beta was run with 15 PITCHFORKS given to contributors.

I'm currently looking to find a good manufacturer - which also does design and produces rugged/waterproof/shielded cases. When I find one finally there'll be a crowdfunding campaign where you can acquire working pitchforks as perks and where you can sponsor future research and development of Project:PITCHFORK.

I must say it was truly an exciting project so far, crypto, low-level HW stuff, assembly, on a platform that reminds me of computing capacities of my early years. Lots of fun and learning. And lots of help from good friends - especially from the Hungarian Autonomous Center for Knowledge, the NLnet and the Renewable Freedom foundations, without their contributions this project would be stuck, and probably forgotten at the bottom of a todo list.


generating pgp ids

2013-04-04

A proper fingerprint from Wikipedia The tool I release today is genkeyid part of my gpk PGP key management suite, which is a tool that helps you bruteforce arbitrary PGP key ids by modifying the timestamp field of public keys so that the packet hashes to a given key id.

I also release setfp.py which allows you to set arbitrary timestamps in PGP RSA key pairs and recalculates the RSA signature accordingly. You might want to combine this with the other already previously released genkey tools.

The two steps are separated, because the bruteforcing does only need a public key, but setfp also needs an unencrypted private key. So if you want to have a special key id, but also maintain Key Management Opsec, you should do the patching offline in a clean system that you discard later.

For the truly ignorant and the ones having extra clean systems and lots of entropy available in bulk, there's genid.sh which does the two steps in one, generating as many unencrypted keypairs as necessary until a suitable is found.

Of course this is nothing new, there are existing examples of manipulated key ids. Some people have issues with the ambiguity of key ids, but one of the authors of PGP says this is ok. The PGP FAQ has more on this.

get it from github/stef/gpk

Or read more: README.genkeyid.org


Announcing pwd.sh

2013-04-03

postits as password managers I wanted to switch to KeepassX to store all my passwords, but I wanted to use GPG to encrypt the passwords. So I came up with pwd.sh. It's a simple shell script that you can bind to your window manager keybindings, and when you invoke it, it uses the current focused window to deduce a key to store the user and the password. For better browsers like Firefox, Chromium, luakit and uzbl this means the currently loaded URLs, for all other windows the current window title. When creating a new password, it is automatically generated only the username is queried. I also wrote a small script that imports all passwords from Firefox into the new format. I'm very happy that now all my passwords isolated from my browsers and they are also protected by my PGP key on my external cryptostick.

When I showed this yesterday in our hackerspace, 2 members immediately installed and started massively improving pwd.sh, thanks asciimoo + potato!

So if you're running linux, like stuff based on the KISS principle, and are a crypto/gpg fetishist you might want to consider trying out this new "keepassx niche-killer" ;)

Check it out: pwd.sh



< prev posts

CC BY-SA RSS Export
Proudly powered by Utterson