Someday Never Comes: OAuth2, JWT, Open-ID Connect and other confusing things

Disclaimer

If feel I have to start this post with an important disclaimer: don’t trust too much what I’m about to say.
The reason why I say this is because we are discussing security. And when you talk about security anything other then 100% correct statements risks to expose you to some risk of any sort.
So, please, read this article keeping in mind that your source of truth should be the official specifications, and that this is just an overview that I use to recap this topic in my own head and to introduce it to beginners.

Mission

I have decided to write this post because I have always found OAuth2 confusing. Even now that I know a little more about it, I found some of its part puzzling.
Even if I was able to follow online tutorials from the likes of Google or Pinterest when I need to fiddle with their APIs, it always felt like some sort of voodoo, with all those codes and Bearer tokens.
And each time they mentioned I could make my own decisions for specific steps, choosing among the standard OAuth2 approach, my mind tended to go blind.

I hope I’ll be able to fix some idea, so that from now on, you will be able to follow OAuth2 tutorials with more confidence.

What is OAuth2?

Let’s start from the definition:

OAuth 2 is an authorisation framework that enables applications to obtain limited access to user accounts on an HTTP service.

The above sentence is reasonably understandable , but we can improve things if we pinpoint the chose terms.

The Auth part of the name, reveals itself to be Authorisation(it could have been Authentication; it’s not).
Framework can be easily overlooked since the term framework is often abused; but the idea to keep here is that it’s not necessarily a final product or something entirely defined. It’s a toolset. A collection of ideas, approaches, well defined interactions that you can use to build something on top of it!
It enable applications to obtain limited access. The key here is that it enables applications not humans.
limited access to user accounts is probably the key part of the definition that can help you to remember and to explain what OAuth2 is:
the main aim is to allow a user to delegate access to a user owned resource. Delegating it to an application.

OAuth2 is about delegation.

It’s about a human, instructing a software to do something on her behalf.
The definition also mentions limited access, so you can imagine of being able to delegate just part of your capabilities.
And it concludes mentioning HTTP services. This authorisation-delegation, happens on an HTTP service.

Delegation before OAuth2

Now that the context should be clearer, we could ask ourselves: How were things done before OAuth2 and similar concepts came out?

Well, most of the time, it was as bad as you can guess: with a shared secret.

If I wanted a software A to be granted access to my stuff on server B, most of the time the approach was to give my user/pass to software A, so that it could use it on my behalf.
This is still a pattern you can see in many modern software, and I personally hope it’s something that makes you uncomfortable.
You know what they say: if you share a secret, it’s no longer a secret!

Now imagine if you could instead create a new admin/password couple for each service you need to share something with. Let’s call them ad-hoc passwords.
They are something different than your main account for a specific service but they still allow to access the same service as they were you. You would be able, in this case, to delegate, but you would still be responsible of keeping track of all this new application-only accounts you need to create.

OAuth2 - Idea

Keeping in mind that the business problem that we are trying to solve is the “delegation” one, we want to extend the ad-hoc password idea to take away from the user the burden of managing these ad-hoc passwords.
OAuth2 calls these ad-hoc passwords tokens.
Tokens, are actually more than that, and I’ll try to illustrate it, but it might be useful to associate them to this simpler idea of an ad-hoc password to begin with.

OAuth2 - Core Business

Oauth 2 Core Business is about:

how to obtain tokens

OAuth2 - What’s a token?

Since everything seems to focus around tokens, what’s a token?
We have already used the analogy of the ad-hoc password, that served us well so far, but maybe we can do better.
What if we look for the answer inside OAuth2 specs?
Well, prepare to be disappointed. OAuth2 specs do not give you the details of how to define a token. Why is this even possible?
Remember when we said that OAuth2 was “just a framework”? Well, this is one of those situation where that definition matters!
Specs just tell you the logical definition of what a token is and describe some of the capabilities it needs to posses.
But at the end, what specs say is that a token is a string. A string containing credentials to access a resource.
It gives some more detail, but it can be said that most of the time, it’s not really important what’s in a token. As long as the application is able to consume them.

A token is that thing, that allows an application to access the resource you are interested into.

To point out how you can avoid to overthink what a token is, specs also explicitly say that “is usually opaque to the client”!
They are practically telling you that you are not even required to understand them!
Less things to keep in mind, doesn’t sound bad!

But to avoid turning this into a pure philosophy lesson, let’s show what a token could be

{
   "access_token": "363tghjkiu6trfghjuytkyen",
   "token_type": "Bearer"
}

A quick glimpse show us that, yeah, it’s a string. JSON-like, but that’s probably just because json is popular recently, not necessarily a requirement.
We can spot a section with what looks like a random string, an id: 363tghjkiu6trfghjuytkyen. Programmers know that when you see something like this, at least when the string is not too long, it’s probably a sign that it’s just a key that you can correlate with more detailed information, stored somewhere else.
And that iss true also in this case.
More specifically, the additional information it will be the details about the specific authorisation that that code is representing.

But then another thing should capture your attention: "token_type": "Bearer".

Your reasonable questions should be: what are the characteristics of a Bearer token type? Are there other types? Which ones?

Luckily for our efforts to keep things simple, the answer is easy ( some may say, so easy to be confusing… )

Specs only talk about Bearer token type!

Uh, so why the person who designed a token this way, felt that he had to specify the only known value?
You might start seeing a pattern here: because OAuth2 is just a framework!
It suggests you how to do things, and it does some of the heavy lifting for you making some choice, but at the end, you are responsible of using the framework to build what you want.
We are just saying that, despite here we only talk about Bearer tokens, it doesn’t mean that you can’t define your custom type, with a meaning you are allowed to attribute to it.

Okay, just a single type. But that is a curious name. Does the name imply anything relevant?
Maybe this is a silly question, but for non-native English speakers like me, what Bearer means in this case could be slightly confusing.

Its meaning is quite simple actually:

A Bearer token is something that if you have a valid token, we trust you. No questions asked.

So simple it’s confusing. You might be arguing: “well, all the token-like objects in real world work that way: if I have valid money, you exchange them for the good you sell”.

Correct. That’s a valid example of a Bearer Token.

But not every token is of kind Bearer. A flight ticket, for example, it’s not a Bearer token.
It’s not enough having a ticket to be allowed to board on a plane. You also need to show a valid ID, so that your ticket can be matched with; and if your name matches with the ticket, and your face match with the id card, you are allowed to get on board.

To wrap this up, we are working with a kind of tokens, that if you posses one of them, that’s enough to get access to a resource.

And to keep you thinking: we said that OAuth2 is about delegation. Tokens with this characteristic are clearly handy if you want to pass them to someone to delegate.

A token analogy

Once again, this might be my non-native English speaker background that suggests me to clarify it.
When I look up for the first translation of token in Italian, my first language, I’m pointed to a physical object.
Something like this:

Token

That, specifically, is an old token, used to make phone calls in public telephone booths.
Despite being a Bearer token, its analogy with the OAuth2 tokens is quite poor.
A much better picture has been designed by Tim Bray, in this old post: An Hotel Key is an Access Token
I suggest you to read directly the article, but the main idea, is that compared to the physical metal coin that I have linked first, your software token is something that can have a lifespan, can be disabled remotely and can carry information.

Actors involved

These are our actors:

Resource Owner
Client (aka Application)
Authorisation Server
Protected Resource

It should be relatively intuitive: an Application wants to access a Protected Resource owned by a Resource Owner. To do so, it requires a token. Tokens are emitted by an Authorisation Server, which is a third party entity that all the other actors trust.

Usually, when a read something new, I tend to quickly skip through the actors of a system. Probably I shouldn’t, but most of the time, the paragraph that talks describe, for example, a “User”, ends up using many words to just tell me that it’s just, well, a user… So I try to look for the terms that are less intuitive and check if some of them has some own characteristic that I should pay particular attention to.

In OAuth2 specific case, I feel that the actor with the most confusing name is Client.
Why do I say so? Because, in normal life (and in IT), it can mean many different things: a user, a specialised software, a very generic software…

I prefer to classify it in my mind as Application.

Stressing out that the Client is the Application we want to delegate our permissions to. So, if the Application is, for example, a server side web application we access via a browser, the Client is not the user or the browser itself: the client is the web application running in its own environment.

I think this is very important. Client term is all over the place, so my suggestion is not to replace it entirely, but to force your brain to keep in mind the relationship Client = Application.

I also like to think that there is another not official Actor: the User-Agent.

I hope I won’t confuse people here, because this is entirely something that I use to build my mental map.
Despite not being defined in the specs, and also not being present in all the different flows, it can help to identify this fifth Actor in OAuth2 flows.
The User-Agent is most of the time impersonated by the Web Browser. Its responsibility is to enable an indirect propagation of information between 2 systems that are not talking directly each other.
The idea is: A should talk to B, but it’s not allowed to do so. So A tells C (the User-Agent) to tell B something.

It might be still a little confusing at the moment, but I hope I’ll be able to clarify this later.

OAuth2 Core Business 2

OAuth2 is about how to obtain tokens.

Even if you are not an expert on OAuth2, as soon as someone mentions the topic, you might immediately think about those pages from Google or the other major service providers, that pop out when you try to login to a new service on which you don’t have an account yet, and tell Google, that yeah, you trust that service, and that you want to delegate some of your permissions you have on Google to that service.

This is correct, but this is just one of the multiple possibly interactions that OAuth2 defines.

There are 4 main ones it’s important you know. And this might come as a surprise if it’s the first time you hear it:
not all of them will end up showing you the Google-like permissions screen!
That’s because you might want to leverage OAuth2 approach even from a command line tool; maybe even without any UI at all, capable of displaying you an interactive web page to delegate permissions.

Remember once again: the main goal is to obtain tokens!

If you find a way to obtain one, the “how” part, and you are able to use them, you are done.

As we were saying, there are 4 ways defined by the OAuth2 framework. Some times they are called flows, sometimes they are called grants.
It doesn’t really matter how you call them. I personally use flow since it helps me reminding that they differ one from the other for the interactions you have to perform with the different actors to obtain tokens.

They are:

Authorisation Code Flow
Implicit Grant Flow
Client Credential Grant Flow
Resource Owner Credentials Grant Flow (aka Password Flow)

Each one of them, is the suggested flow for specific scenarios.
To give you an intuitive example, there are situation where your Client is able to keep a secret(a server side web application) and other where it technically can’t (a client side web application you can entirely inspect it’s code with a browser).
Environmental constraints like the one just described would make insecure ( and useless ) some of the steps defined in the full flow. So, to keep it simpler, other flows have been defined when some of the interactions that were impossible or that were not adding any security related value, have been entirely skipped.

OAuth2 Poster Boy: Authorisation Code Flow

We will start our discussion with Authorisation Code Flow for three reasons:

it’s the most famous flow, and the one that you might have already interacted with (it’s the Google-like delegation screen one)
it’s the most complex, articulated and inherently secure
the other flows are easier to reason about, when compared to this one

The Authorisation Code Flow, is the one you should use if your Client is trusted and is able to keep a secret. This means a server side web application.

How to get a token with Authorisation Code Flow

All the involved Actors trust the Authorisation Server
User(Resource Owner) tells a Client(Application) to do something on his behalf
Client redirects the User to an Authorisation Server, adding some parameters: redirect_uri, response_type=code, scope, client_id
Authorisation Server asks the User if he wishes to grant Client access some resource on his behalf(delegation) with specific permissions(scope).
User accepts the delegation request, so the Auth Server sends now an instruction to the User-Agent(Browser), to redirect to the url of the Client. It also injects a code=xxxxx into this HTTP Redirect instruction.
Client, that has been activated by the User-Agent thanks to the HTTP Redirect, now talks directly to the Authorisation Server (bypassing the User-Agent). client_id, client_secret and code(that it had been forwarded).
Authorisation Server returns the Client (not the browser) a valid access_token and a refresh_token

This is so articulated that it’s also called the OAuth2 dance!

Let’s underline a couple of points:

At step 2, we specify, among the other params, a redirect_uri. This is used to implement that indirect communication we anticipated when we have introduced the User-Agent as one of the actors. It’s a key information if we want to allow the Authorisation Server to forward information to the Client without a direct network connection open between the two.
the scope mentioned at step 2 is the set of permissions the Client is asking for
Remember that this is the flow you use when the client is entirely secured. It’s relevant in this flow at step 5, when the communication between the Client and the Authorisation Server, avoids to pass through the less secure User-Agent (that could sniff or tamper the communication). This is also why, it makes sense that for the Client to enable even more security, that is to send its client_secret, that is shared only between him and the Authorisation Server.
The refresh_token is used for subsequent automated calls the Client might need to perform to the Authorisation Server. When the current access_token expires and it needs to get a new one, sending a valid refresh_token allows to avoid asking the User again to confirm the delegation.

OAuth2 Got a token, now what?

OAuth2 is a framework remember. What does the framework tells me to do now?

Well, nothing. =P

It’s up to the Client developer.

She could (and often should):

check if token is still valid
look up for detailed information about who authorised this token
look up what are the permissions associated to that token
any other operation that it makes sense to finally give access to a resource

They are all valid, and pretty obvious points, right?
Does the developer have to figure out on her own the best set of operations to perform next?
She definitely can. Otherwise she can leverage another specification: OpenIDConnect(OIDC). More on this later.

OAuth2 - Implicit Grant Flow

It’s the flow designed for Client application that can’t keep a secret. An obvious example are client side HTML applications. But even any binary application whose code is exposed to the public can be manipulated to extract their secrets.
Couldn’t we have re-used the Authorisation Code Flow?
Yes, but… What’s the point of step 5) if secret is not a secure secret anymore? We don’t get any protection from that additional step!
So, Implicit Grant Flow, is just similar to Authorisation Code Flow, but it doesn’t perform that useless step 5.
It aims to obtain directly access_tokens without the intermediate step of obtaining a code first, that will be exchanged together with a secret, to obtain an access_token.

It uses response_type=token to specific which flow to use while contacting the Authorisation Server.
And also that there is no refresh_token. And this is because it’s assumed that user sessions will be short (due to the less secure environment) and that anyhow, the user will still be around to re-confirm his will to delegate(this was the main use case that lead to the definition of refresh_tokens).

OAuth2 - Client Credential Grant Flow

What if we don’t have a Resource Owner or if he’s indistinct from the Client software itself (1:1 relationship) ?
Imagine a backend system that just wants to talk to another backend system. No Users involved.
The main characteristic of such an interaction is that it’s no longer interactive, since we no longer have any user that is asked to confirm his will to delegate something.
It’s also implicitly defining a more secure environment, where you don’t have to be worried about active users risking to read secrets.

Its type is response_type=client_credentials.

We are not detailing it here, just be aware that it exist, and that just like the previous flow, it’s a variation, a simplification actually, of the full OAuth dance, that you are suggested to use if your scenario allows that.

OAuth2 - Resource Owner Credentials Grant Flow (aka Password Flow)

Please raise your attention here, because you are about to be confused.

This is the scenario:
The Resource Owner, has an account on the Authorisation Server. The Resource Owner gives his account details to the Client. The Client use this details to authenticate to the Authorisation Server…

If you have followed through the discussion you might be asking if I’m kidding you.
This is exactly the anti-pattern we tried to move away from at the beginning of our OAuth2 exploration!

How is it possible to find it listed here as possible suggested flow?

The answer is quite reasonable actually: It’s a possible first stop for migration from a legacy system.
And it’s actually a little better than the shared password antipattern:
The password is shared but that is just a mean to start the OAuth Dance used to obtain tokens.

This allows OAuth2 to put its foot into the door, if we don’t have better alternatives.
It introduces the concept of access_tokens, and it can be used until the architecture will be mature enough (or the environment will change) to allow a better and more secure Flow to obtain tokens.
Also, please notice that now tokens are the ad-hoc password that reaches the Protected Resource system, while in the fully shared password antipattern, it was our password that needs to be forwarded.

So, far from ideal, but at least we justified by some criteria.

How to chose the best flow?

There are many decision flow diagrams on the internet. One of those that I like the most is this one:

OAuth2 Flows from https://auth0.com

It should help you to remember the brief description I have gave you here and to chose the easiest flow based on your environment.

OAuth2 Back to tokens - JWT

So, we are able to get tokens now. We have multiple ways to get them. We have not been told explicitly what to do with them, but with some extra effort and a bunch of additional calls to the Authorisation Server we can arrange something and obtain useful information.

Could things be better?

For example, we have assumed so fare that our tokens might look like this:

{
   "access_token": "363tghjkiu6trfghjuytkyen",
   "token_type": "Bearer"
}

Could we have more information in it, so to save us some round-trip to the Authorisation Server?

Something like the following would be better:

{
  "active": true,
  "scope": "scope1 scope2 scope3",
  "client_id": "my-client-1",
  "username": "paolo",
  "iss": "http://keycloak:8080/",
  "exp": 1440538996,
  "roles" : ["admin", "people_manager"],
  "favourite_color": "maroon",
  ... : ...
}

We’d be able to access directly some information tied to the Resource Owner delegation.

Luckily someone else had the same idea, and they came out with JWT - JSON Web Tokens.
JWT is a standard to define the structure of JSON based tokens representing a set of claims. Exactly what we were looking for!

Actually the most important aspect that JWT spec gives us is not in the payload that we have exemplified above, but in the capability to trust the whole token without involving an Authorizatin Server!

How is that even possible? The idea is not a new one: asymmetric signing (pubkey), defined, in the context of JWT by JOSE specs.

Let me refresh this for you:

In asymmetric signing two keys are used to verify the validity of information.
These two keys are coupled, but one is secret, known only to the document creator, while the other is public.
The secret one is used to calculate a fingerprint of the document; an hash.
When the document is sent to destination, the reader uses the public key, associated with the secret one, to verify if the document and the fingerprint he has received are valid.
Digital signing algorithms tell us that the document is valid, according to the public key, only if it’s been signed by the corresponding secret key.

The overall idea is: if our local verification passes, we can be sure that the message has been published by the owner of the secret key, so it’s implicitly trusted.

And back to our tokens use case:

We receive a token. Can we trust this token? We verify the token locally, without the need to contact the issuer. If and only if, the verification based on the trusted public key passes, we confirm that token is valid. No question asked. If the token is valid according to digital signage AND if it’s alive according to its declared lifespan, we can take those information as true and we don’t need to ask for confirmation to the Authorisation Server!

As you can imagine, since we put all this trust in the token, it might be savvy not to emit token with an excessively long lifespan:
someone might have changed his delegation preferences on the Authorisation Server, and that information might not have reached the Client, that still has a valid and signed token it can based its decision onto.
Better to keep things a little more in sync, emitting tokens with a shorter life span, so, eventual outdated preferences don’t risk to be trusted for long periods.

OpenID Connect

I hope this section won’t disappoint you, but the article was already long and dense with information, so I’ll keep it short on purpose.

OAuth2 + JWT + JOSE ~= OpenID Connect

Once again: OAuth2 is a framework.
OAuth2 framework is used in conjunction with JWT specs, JOSE and other ideas we are not going to detail here, the create OpenID Connect specification.

The idea you should bring back is that, more often you are probably interested into using and leveraging OpenID Connect, since it puts together the best of the approaches and idea defined here.
You are, yes, leveraging OAuth2, but you are now the much more defined bounds of OpenID Connect, that gives you richer tokens and support for Authentication, that was never covered by plain OAuth2.

Some of the online services offer you to chose between OAuth2 or OpenID Connect. Why is that?
Well, when they mention OpenID Connect, you know that you are using a standard. Something that will behave the same way, even if you switch implementation.
The OAuth2 option you are given, is probably something very similar, potentially with some killer feature that you might be interested into, but custom built on top of the more generic OAuth2 framework.
So be cautious with your choice.

Conclusion

If you are interested into this topic, or if this article has only confused you more, I suggest you to check OAuth 2 in Action by Justin Richer and Antonio Sanso.
On the other side, if you want to check your fresh knowledge and you want to try to apply it to an open source Authorisation Server, I will definitely recommend playing with Keycloak that is capable of everything that we have described here and much more!

Someday Never Comes

Thursday, June 29, 2017

OAuth2, JWT, Open-ID Connect and other confusing things