Someday Never Comes: 2017

Tuesday, September 26, 2017

Ansible - A handy tool for people that might not need it

tl;dr

Ansible is a good tool. You can achieve what it gives you in many different ways, and you might already be doing so. With this assumption in mind, I’ll try to show you why you might like it.

Ansible and I

I recently passed the exam for the Red Hat Certificate of Expertise in Ansible Automation.

I took the exam because I felt that I didn’t know much about Ansible, and because the topic was hot, with new modules and integrations popping up on a daily basis.
Each time I can match a new learning opportunity with the chance to verify my understanding of a topic against a formal Red Hat exam, I take it. That’s how I got that nice t-shirt when I got my RHCA =P

What’s Ansible?

Ansible is a configuration management and deployment DevOps tool.

Let’s break this apart, starting from the end: I haven’t thrown DevOps in just as a catchy buzzword; I cited it to give you some context. We are in the DevOps world here, somewhere in between system administration, operations and developers’ deliverables.
We are dealing with configuration management, meaning that we probably have quite a large matrix of specific configurations that has to match an equally complex matrix of environments to apply them to.
And it seems that we can also use the tool to PUT deliverables and configuration on the target machines we are interested in managing.

So far so good.

Just to check if you are still with me, I’ll challenge you with a provocative question:

Tell me the name of a technology, out there since forever, that you have used to perform the same task in the past.

I’m not pretending to guess everyone’s answer, but my hope is that a large portion of readers would reply with any variation of this: scripts or bash.

If that was your reply too, we are definitely on the same page, and I can assure you that I’ll try to keep this comparison in mind throughout this blog post, hoping that it is a useful way to tackle the topic.

Alternatives

If you work in IT, at probably any level, you must have some familiarity with at least one of the alternative tools that perform a similar job.
The usual suspects are Chef, Puppet, Salt and bash/scripts for Configuration Management, and Capistrano, Fabric and bash/scripts for Deployment.
Again, I’m stressing the custom scripts part here. Anything that carries the adjective custom in its name is a possible code smell in an industry that aims to improve via standardization. At the same time, it implicitly suggests that if your software management model is mature enough to give you everything you need, you are probably already where you want to be.

A specific characteristic that distinguishes Ansible from most of its alternatives is the fact that it has an agentless architecture.
This means that you won’t have a daemon or equivalent process, always running on the managed system, listening for a command from a central server to perform any operation.

But… you will have to rely on a daemon, always running, to allow your managed node to perform the operations you want.
Is this a fraud?
No, but I like to draw out the pros and cons of things, without cheating and hiding behind too-good-to-be-true marketing claims.
For Ansible to be able to work in its agentless way, you have to rely on an SSHD daemon (or an equivalent, when you are outside *NIX) to allow remote connections.
So it’s not a fraud. An always-running SSHD process is found on many systems, even when you don’t use Ansible to manage them. But at the same time, you cannot say that because you don’t have an agent running on a managed node, you don’t have anything else running in its place!

Is this all? Ansible press says yes, I personally say no.

Uh? The story out there keeps repeating the refrain that SSH is the only thing you need to run Ansible. I feel that this is not completely correct, and if you take it as an absolute assertion it’s just plain wrong:

Besides sshd, Ansible needs Python installed on the managed host. This might be overlooked if you only manage nodes based on modern Linux distros. But nowadays you might find yourself managing lower-level devices, like IoT ones, that try to reduce the number of installed packages, and you might end up without Python. In those cases, Ansible wouldn’t work.
In the specific case of local deployment, you don’t even need SSHD! I get that this is a specific use case, but I think it’s an important one. I might want to use Ansible as a full replacement for a large bash script, for example. Bash doesn’t assume ssh. It’s mainly a collection of command invocations tied together with execution logic. Ansible can be that for you. And if that’s the case, you don’t even need sshd running locally, since its implicit support for localhost allows you to run an Ansible script anyway.

Why python?

This is an important question to understand how Ansible operates:

Ansible allows you to express scripts, that it calls playbooks, written with a custom YAML based syntax/dialect.

Why python then?
Because the playbook written in YAML is not the artifact that is going to be run on the managed node. It’s a higher-level description that the Ansible toolkit processes on the control machine: for each task, it packages the corresponding module, which is a Python program, together with the arguments you specified in YAML.
This Python program is then copied to the managed node and run there. Now you see why Python is a strict requirement for the managed nodes.

The DSL

Ansible uses a YAML-based DSL for its scripts.
This is one of those points where I have mixed feelings: in general, I’m pro Domain Specific Languages. The expectation is that they offer language-level facilities to perform the most common tasks in the domain more easily. The main reason for my mixed feelings is that I have quite a bad memory for language syntaxes. I can make things work with a sample to look at or with the internet, but I’m never sure what the language-specific conventions are for functions, parentheses and such.

In the case of Ansible, and considering that the main alternative is bash, which is not really intuitive for anything that relates to formatting, semicolons, control structures and string manipulation, I can honestly say that the DSL idea works well.

This is the skeleton of an Ansible playbook:

---
- name: My Playbook
  hosts: host1,host2
  tasks:
  - name: say hello
    debug:
      msg: "Hello world"
  - name: copy file
    copy:
      src: /path/to/source
      dest: /path/to/dest
...

And I swear I have typed it without copying it.
As you can see, it is reasonably intuitive.
As you can imagine, someone needs to tell you the list of allowed keywords, and you have to know what’s mandatory and what’s optional, but you can probably read it and start setting your expectations.

If you are like me (or Alan Kay), you are probably already mumbling that, yeah, Ansible looks simple for simple tasks, but I haven’t shown how fit it is for advanced stuff. Just be patient a little longer: I’ll come back to this later.

The structure of the sample script above covers a large part of what you need to know about Ansible. There are 2 key elements:

hosts
tasks

Not surprisingly, you have a list of hosts. You might have written a similar bash script with the hostnames externalized in their own variable, but these are more complex than they look. They are not a plain list of addresses: they are keys in the dictionary that Ansible uses to keep track of managed hosts. You may wonder: why all this complexity?

There are 2 main reasons:

The first one is that, with a complex object/dictionary tied to each managed host, you can attach a lot of other data and metadata to the entry. You have a place to specify users, connection parameters and such, up to custom variables to which you want to assign a special value just for that specific node.
The second one is less obvious, and would be harder to implement if things weren’t broken down this way: this decoupled mechanism allows you to easily plug in dynamic data for the collection of hosts, which Ansible calls the inventory.
In Ansible, an inventory can be a static file with the list of hosts and their properties, or it can be any executable file that, when invoked with special parameters defined by a service contract, returns a JSON object containing the same information you would put in a static file. If this seems overkill to you, just think about how much less static things are nowadays with virtualization and cloud environments.
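
To make the dynamic inventory idea a bit more concrete, here is a minimal sketch in Python of such an executable. The group name, the host variables and their values are made up for illustration; the parts that come from Ansible's contract are the --list and --host parameters and the JSON shape with its _meta/hostvars section. Take it as a sketch under these assumptions, not as a production-ready script.

```python
#!/usr/bin/env python3
"""Minimal dynamic inventory sketch: prints JSON describing the managed hosts."""
import json
import sys


def build_inventory():
    # Hypothetical data, hard-coded for illustration only.
    # A real script would query a cloud API, a CMDB or similar.
    return {
        "webservers": {
            "hosts": ["host1", "host2"],
            "vars": {"http_port": 8080},
        },
        "_meta": {
            "hostvars": {
                "host1": {"ansible_user": "deploy"},
                "host2": {"ansible_user": "deploy"},
            }
        },
    }


def main(argv):
    inventory = build_inventory()
    # "--list" asks for the whole inventory.
    if len(argv) == 2 and argv[1] == "--list":
        return json.dumps(inventory)
    # "--host <name>" asks for the variables of a single host.
    if len(argv) == 3 and argv[1] == "--host":
        return json.dumps(inventory["_meta"]["hostvars"].get(argv[2], {}))
    # Anything else: return an empty JSON object.
    return json.dumps({})


if __name__ == "__main__":
    print(main(sys.argv))
```

Once the file is marked as executable, you would pass it to Ansible with the -i flag, in the same place where you would otherwise pass a static inventory file.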

Now the tasks part.
This is the easy part because something else encapsulates the harder aspects.
tasks is just a list of individual operations to perform.
As simple as that.
The operations are provided to you by a large standard library of modules. In our example, we use 2 very simple ones: debug, used just to print out strings, and copy which, as you can guess, just copies a file.
This is probably the aspect where Ansible shines the most: it has a massive set of off-the-shelf components that you can start using just by passing them the required configuration.
The list here is really long. You can check it yourself with ansible-doc, a companion CLI tool that is your best friend when you are working on a script, probably even more than Google:
ansible-doc -l gives you the full list of modules currently supported by Ansible, from modules that configure enterprise load balancers to others that configure and deploy AWS Lambdas. Literally everything for everyone!

ansible-doc is super useful because the style of the documentation it lets you browse has been kept light on purpose, with a large focus on examples.

For example, if you inspect the documentation of the get_url module and jump to the EXAMPLES section, you’ll probably find a snippet you can start customizing for your specific needs:
e.g.

ansible-doc get_url
...
EXAMPLES:
- name: download foo.conf
  get_url:
    url: http://example.com/path/file.conf
    dest: /etc/foo.conf
    mode: 0440
...

A large standard library + a light inline documentation is most likely something your custom bash scripts cannot offer you.

Up to this point, if you are a skeptic like me, you would probably tell me that you appreciate all this, but that other than a particularly large module library and excellent documentation you don’t see any compelling reason to move your consolidated practices to a new tool.

And I agree.

So I’ll weigh in with some further features.

ansible-playbook, the CLI command you pass your scripts to in order to run them, has some built-in facilities that, once again, are really handy for the person who is writing a script and has to debug it.

For example, you can pass it a flag (--step) that makes Ansible prompt you, before each task, to ask whether you really want to run it or skip it.
Or you might want to use a flag to provide the name of the specific task in your list to start the execution from: --start-at-task=.
Or you might want to run only the tasks tagged with a specific marker: --tags=.
You even have a flag (--check) to run the script in test mode, so that no real operation is invoked but, where possible and where it makes sense, you get back an output that tells you what effect your invocation would have had on the remote system!

These (and many others I don’t have time to list here) are definitely nice facilities for anyone who has to write a script, in any language.

Another important feature to underline is the fact that the operations of Ansible modules are designed to be idempotent.
This means that they are written in such a way that if you perform the exact same operation more than once, the system is left in the same state as after the first invocation.

This is probably better explained with an example:
Imagine that the operation you have to perform is to put a specific line in a file.
A naive way to do this in bash is by appending. If I just use the >> operator, I’m sure I will always add my entry to that file.
But what happens if the entry was already there? If I append a second time, I add a second entry. And this is unlikely to be what I need most of the time.
To correctly solve my issue, I should append the entry only if it isn’t there already.
It’s still not super complex, but I definitely have to keep more things in mind.

Ansible modules try to encapsulate this “idempotency” responsibility: you just declare the line of text you want in a file. It’s up to the module to verify that it won’t be added a second time if it is already present.
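
To show what "encapsulating the idempotency responsibility" could look like, here is a small sketch in Python. It is not how any Ansible module is actually implemented: ensure_line is a hypothetical helper I made up, and the file path and the line in the demo are placeholders. The point is only the check-before-write logic, plus a return value that says whether anything changed.

```python
import os
import tempfile
from pathlib import Path


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be modified, False if the line
    was already there and nothing was touched.
    """
    target = Path(path)
    existing = target.read_text().splitlines() if target.exists() else []
    if line in existing:
        # Desired state already reached: do nothing.
        return False
    existing.append(line)
    target.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    # Demo on a throwaway file in a temporary directory.
    demo_path = os.path.join(tempfile.mkdtemp(), "hosts")
    print(ensure_line(demo_path, "10.0.0.1 db"))  # first run modifies the file
    print(ensure_line(demo_path, "10.0.0.1 db"))  # second run leaves it alone
```

Running the helper twice with the same arguments leaves a single copy of the line in the file, which is exactly the property the >> operator alone does not give you.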

Another way to explain this approach is that Ansible syntax is declarative. You just need to define the final state of the system you are interacting with, not really the complete logic performed to get to that state.

This nifty feature allows you to run the same script against the same node more than once without the risk of ruining the system configuration. This lets you avoid disrupting a healthy system if a script is run twice by mistake, but it also speeds things up a lot in many cases: imagine a complex script that downloads files, extracts them, installs them and so on… If a system is already in the final state declared by your script, you can avoid performing redundant steps!

Idempotency is also important to understand ansible-playbook output.

At the end of an invocation of a script, you’ll get a recap similar to this:

PLAY RECAP ****************************************************************************
host1                  : ok=1    changed=1    unreachable=0    failed=0

This summary recaps what happened in the script. You also have the full transcript in the lines above it, which I’m not showing here, but I want to focus on the recap.

The recap is strictly tied to the idempotency idea: the ok column counts the tasks that completed successfully, while the changed column counts how many of those actually performed some work and altered the system.
A task that is counted in ok but not in changed HAS NOT altered the system, since it was already in the desired state!

Some built-in constructs, like handlers, and also the logic you might want to use in your scripts, are built around the concept of changed: do other stuff only if a specific task has changed the system; otherwise you probably don’t need to (since the system hasn’t changed).

Advanced stuff

Disclaimer: this is not really advanced stuff. But it’s the kind of stuff for which, in bash, you often rely on documentation or examples to check how it’s done.

Let’s start with the setup module:
Ansible’s default behavior, unless you ask to turn it off, is to run a discovery module called setup as the very first step, without you having to define it explicitly. This module gathers a lot of useful information about the system you are connecting to, and uses it to populate a set of variables you can use in your playbook.
It’s a handy way to have quick access to information like IP addresses, memory availability and many other data or metrics you might base your logic on.

What about loops?

Bash has that bizarre rule of newline vs. semicolon that makes me unable to remember how to write one without checking the internet.

In Ansible it is simpler:

...
- name: sample loop, using debug but works with any module
  debug:
    msg: "{{ item }}"
  with_items:
  - "myFirstItem"
  - "mySecondItem"
...

There is a collection of with_* keywords you can use to iterate over specific objects: in our case a plain YAML list, but it could be a dictionary, a list of file names with glob support, and many others.

Conditional flow
There is no strict equivalent for if/else logic. At least, not one that creates a traditional execution tree.
The pattern here pushes you to keep the execution logic linear, with just a boolean flag that enables or disables the specific task.

And it’s as simple as writing:

...
when: true
...

And you guessed right if you think that any expression that evaluates to true can be specified there, something like: ansible_os_family == "RedHat" and ansible_lsb.major_release|int >= 6.

Don’t be disappointed by my comment that Ansible doesn’t encourage nested if/else logic. You can still get it with a combination of other keywords like block and when. But the point is to try to keep the execution flow linear, and thus simpler.

The last advanced feature I want to cover is Ansible templates and string interpolation.

The tool has an embedded mechanism to allow you to build dynamic strings. This allows you to define a basic skeleton for a text where you define some placeholders, that are automatically replaced by the values of the corresponding variables that are present in the running context.

For example:

debug:
  msg: "Hello {{ inventory_hostname }}"

The same feature is also available in the form of the template module: you can define an external file containing your template, which Ansible will process, replacing the tokens it finds with their values, and copy to the remote system in a single pass. This is an ideal use case for configuration files, for example:

template:
  src: /mytemplates/foo.j2
  dest: /etc/file.conf
  owner: bin
  mode: 0644

I have decided to stop my overview here, even though Ansible offers many more of the features you would expect from a DevOps tool:

and many others!

My hope is that this article triggers your curiosity to explore the topic on your own, having shown that, although you are probably able to obtain the same effect with other technologies, Ansible’s proposition is interesting and might deserve a deeper look.

Thursday, June 29, 2017

OAuth2, JWT, Open-ID Connect and other confusing things

Disclaimer

I feel I have to start this post with an important disclaimer: don’t trust what I’m about to say too much.
The reason I say this is that we are discussing security. And when you talk about security, anything other than 100% correct statements exposes you to risk of some sort.
So, please, read this article keeping in mind that your source of truth should be the official specifications, and that this is just an overview that I use to recap this topic in my own head and to introduce it to beginners.

Mission

I have decided to write this post because I have always found OAuth2 confusing. Even now that I know a little more about it, I find some of its parts puzzling.
Even if I was able to follow online tutorials from the likes of Google or Pinterest when I needed to fiddle with their APIs, it always felt like some sort of voodoo, with all those codes and Bearer tokens.
And each time they mentioned that I could make my own decisions for specific steps, choosing among the standard OAuth2 approaches, my mind tended to go blank.

I hope I’ll be able to pin down some ideas, so that from now on you will be able to follow OAuth2 tutorials with more confidence.

What is OAuth2?

Let’s start from the definition:

OAuth 2 is an authorisation framework that enables applications to obtain limited access to user accounts on an HTTP service.

The above sentence is reasonably understandable, but we can improve things if we pinpoint the chosen terms.

The Auth part of the name reveals itself to be Authorisation (it could have been Authentication; it’s not).
Framework can be easily overlooked, since the term framework is often abused; but the idea to keep here is that it’s not necessarily a final product or something entirely defined. It’s a toolset: a collection of ideas, approaches and well-defined interactions that you can build something on top of!
It enables applications to obtain limited access. The key here is that it enables applications, not humans.
limited access to user accounts is probably the key part of the definition, the one that can help you remember and explain what OAuth2 is:
the main aim is to allow a user to delegate access to a user-owned resource, delegating it to an application.

OAuth2 is about delegation.

It’s about a human instructing a piece of software to do something on her behalf.
The definition also mentions limited access, so you can imagine being able to delegate just part of your capabilities.
And it concludes by mentioning HTTP services: this authorisation-delegation happens on an HTTP service.

Delegation before OAuth2

Now that the context should be clearer, we could ask ourselves: How were things done before OAuth2 and similar concepts came out?

Well, most of the time, it was as bad as you can guess: with a shared secret.

If I wanted software A to be granted access to my stuff on server B, most of the time the approach was to give my user/pass to software A, so that it could use it on my behalf.
This is still a pattern you can see in a lot of modern software, and I personally hope it’s something that makes you uncomfortable.
You know what they say: if you share a secret, it’s no longer a secret!

Now imagine if you could instead create a new user/password pair for each service you need to share something with. Let’s call them ad-hoc passwords.
They are different from your main account for a specific service, but they still allow access to the same service as if they were you. You would be able, in this case, to delegate, but you would still be responsible for keeping track of all these new application-only accounts you need to create.

OAuth2 - Idea

Keeping in mind that the business problem that we are trying to solve is the “delegation” one, we want to extend the ad-hoc password idea to take away from the user the burden of managing these ad-hoc passwords.
OAuth2 calls these ad-hoc passwords tokens.
Tokens are actually more than that, and I’ll try to illustrate why, but it might be useful to associate them with this simpler idea of an ad-hoc password to begin with.

OAuth2 - Core Business

OAuth2’s core business is about:

how to obtain tokens

OAuth2 - What’s a token?

Since everything seems to focus around tokens, what’s a token?
We have already used the analogy of the ad-hoc password, that served us well so far, but maybe we can do better.
What if we look for the answer inside OAuth2 specs?
Well, prepare to be disappointed. The OAuth2 specs do not give you the details of how to define a token. How is this even possible?
Remember when we said that OAuth2 was “just a framework”? Well, this is one of those situations where that definition matters!
The specs just give you the logical definition of what a token is and describe some of the capabilities it needs to possess.
But in the end, what the specs say is that a token is a string. A string containing credentials to access a resource.
They give some more detail, but it can be said that, most of the time, it’s not really important what’s in a token, as long as the application is able to consume it.

A token is that thing that allows an application to access the resource you are interested in.

To point out how you can avoid overthinking what a token is, the specs also explicitly say that the string “is usually opaque to the client”!
They are practically telling you that you are not even required to understand it!
Fewer things to keep in mind doesn’t sound bad!

But to avoid turning this into a pure philosophy lesson, let’s show what a token could look like:

{
   "access_token": "363tghjkiu6trfghjuytkyen",
   "token_type": "Bearer"
}

A quick glimpse shows us that, yeah, it’s a string. JSON-like, but that’s probably just because JSON is popular these days, not necessarily a requirement.
We can spot a section with what looks like a random string, an id: 363tghjkiu6trfghjuytkyen. Programmers know that when you see something like this, at least when the string is not too long, it’s probably a sign that it’s just a key that you can correlate with more detailed information, stored somewhere else.
And that is true in this case too.
More specifically, the additional information will be the details of the specific authorisation that the code represents.

But then another thing should capture your attention: "token_type": "Bearer".

Your reasonable questions should be: what are the characteristics of a Bearer token type? Are there other types? Which ones?

Luckily for our efforts to keep things simple, the answer is easy (some may say, so easy as to be confusing…)

Specs only talk about Bearer token type!

Uh, so why did the person who designed a token this way feel the need to specify the only known value?
You might start seeing a pattern here: because OAuth2 is just a framework!
It suggests how to do things, and it does some of the heavy lifting for you by making some choices, but in the end you are responsible for using the framework to build what you want.
We are just saying that, although we only talk about Bearer tokens here, it doesn’t mean that you can’t define your own custom type, with a meaning you are free to attribute to it.

Okay, just a single type. But that is a curious name. Does the name imply anything relevant?
Maybe this is a silly question, but for non-native English speakers like me, what Bearer means in this case could be slightly confusing.

Its meaning is quite simple actually:

A Bearer token is a token for which mere possession is enough: if you hold a valid one, we trust you. No questions asked.

So simple it’s confusing. You might be arguing: “well, all the token-like objects in the real world work that way: if I have valid money, you exchange it for the goods you sell”.

Correct. That’s a valid example of a Bearer token.

But not every token is of the Bearer kind. A flight ticket, for example, is not a Bearer token.
Having a ticket is not enough to be allowed to board a plane. You also need to show a valid ID that your ticket can be matched against; if your name matches the ticket and your face matches the ID card, you are allowed to get on board.

To wrap this up, we are working with a kind of token where possessing one is enough to get access to a resource.

And to keep you thinking: we said that OAuth2 is about delegation. Tokens with this characteristic are clearly handy if you want to pass them to someone to delegate.
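
To make the "no questions asked" part tangible, here is a small sketch in Python of how a Bearer token typically travels and gets checked. The Authorization: Bearer header format comes from the Bearer token specification; everything else is made up for illustration: TOKEN_STORE, its contents and the function names are hypothetical, and a real server would keep this information in a database and also check expiry.

```python
# Hypothetical server-side store: token string -> details of the authorisation.
TOKEN_STORE = {
    "363tghjkiu6trfghjuytkyen": {"owner": "alice", "scope": ["read:photos"]},
}


def bearer_header(access_token):
    """What the Client attaches to its HTTP request."""
    return {"Authorization": "Bearer " + access_token}


def authorise(headers, store=TOKEN_STORE):
    """What the Protected Resource does with an incoming request.

    Returns the authorisation details if the token is known,
    None otherwise. Note that nobody asks who is presenting it.
    """
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme != "Bearer":
        return None
    return store.get(token)


if __name__ == "__main__":
    print(authorise(bearer_header("363tghjkiu6trfghjuytkyen")))
    print(authorise(bearer_header("not-a-valid-token")))
```

The lookup in the store is also the "key that you can correlate with more detailed information" idea from before: the token itself is opaque, the meaning lives on the server side.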

A token analogy

Once again, it might be my non-native English speaker background that prompts me to clarify this.
When I look up the first translation of token in Italian, my first language, I’m pointed to a physical object.
Something like this:

Token

That, specifically, is an old token, used to make phone calls in public telephone booths.
Despite being a Bearer token, its analogy with the OAuth2 tokens is quite poor.
A much better picture has been drawn by Tim Bray in this old post: An Hotel Key is an Access Token
I suggest you read the article directly, but the main idea is that, compared to the physical metal coin above, your software token is something that can have a lifespan, can be disabled remotely and can carry information.

Actors involved

These are our actors:

Resource Owner
Client (aka Application)
Authorisation Server
Protected Resource

It should be relatively intuitive: an Application wants to access a Protected Resource owned by a Resource Owner. To do so, it requires a token. Tokens are emitted by an Authorisation Server, which is a third party entity that all the other actors trust.

Usually, when I read something new, I tend to quickly skip through the actors of a system. Probably I shouldn’t, but most of the time, the paragraph that describes, for example, a “User” ends up using many words to just tell me that it’s, well, a user… So I try to look for the terms that are less intuitive and check whether any of them has some characteristic of its own that I should pay particular attention to.

In the specific case of OAuth2, I feel that the actor with the most confusing name is Client.
Why do I say so? Because, in normal life (and in IT), it can mean many different things: a user, a specialised piece of software, a very generic piece of software…

I prefer to classify it in my mind as Application.

I want to stress that the Client is the Application we want to delegate our permissions to. So, if the Application is, for example, a server-side web application we access via a browser, the Client is not the user or the browser itself: the Client is the web application running in its own environment.

I think this is very important. The term Client is all over the place, so my suggestion is not to replace it entirely, but to force your brain to keep in mind the relationship Client = Application.

I also like to think that there is another, unofficial Actor: the User-Agent.

I hope I won’t confuse people here, because this is entirely something that I use to build my mental map.
Despite not being defined in the specs, and not being present in all the different flows, it can help to identify this fifth Actor in OAuth2 flows.
The User-Agent is most of the time played by the Web Browser. Its responsibility is to enable an indirect propagation of information between 2 systems that are not talking directly to each other.
The idea is: A should talk to B, but it’s not allowed to do so. So A tells C (the User-Agent) to tell B something.

It might be still a little confusing at the moment, but I hope I’ll be able to clarify this later.

OAuth2 Core Business 2

OAuth2 is about how to obtain tokens.

Even if you are not an expert on OAuth2, as soon as someone mentions the topic you might immediately think about those pages from Google or the other major service providers that pop up when you try to log in to a new service on which you don’t have an account yet, and where you tell Google that, yeah, you trust that service, and that you want to delegate to it some of the permissions you have on Google.

This is correct, but it is just one of the multiple possible interactions that OAuth2 defines.

There are 4 main ones that it’s important to know. And this might come as a surprise if it’s the first time you hear it:
not all of them will end up showing you the Google-like permissions screen!
That’s because you might want to leverage the OAuth2 approach even from a command line tool, maybe even one without any UI capable of displaying an interactive web page to delegate permissions.

Remember once again: the main goal is to obtain tokens!

If you find a way to obtain one (the “how” part) and you are able to use it, you are done.

As we were saying, there are 4 ways defined by the OAuth2 framework. Sometimes they are called flows, sometimes they are called grants.
It doesn’t really matter what you call them. I personally use flow, since it helps remind me that they differ from one another in the interactions you have to perform with the different actors to obtain tokens.

They are:

Authorisation Code Flow
Implicit Grant Flow
Client Credentials Grant Flow
Resource Owner Credentials Grant Flow (aka Password Flow)

Each one of them is the suggested flow for specific scenarios.
To give you an intuitive example, there are situations where your Client is able to keep a secret (a server-side web application) and others where it technically can’t (a client-side web application whose code you can entirely inspect with a browser).
Environmental constraints like the one just described would make some of the steps defined in the full flow insecure (and useless). So, to keep things simpler, other flows have been defined in which the interactions that were impossible, or that were not adding any security-related value, have been skipped entirely.

OAuth2 Poster Boy: Authorisation Code Flow

We will start our discussion with Authorisation Code Flow for three reasons:

it’s the most famous flow, and the one that you might have already interacted with (it’s the Google-like delegation screen one)
it’s the most complex, articulated and inherently secure
the other flows are easier to reason about, when compared to this one

The Authorisation Code Flow is the one you should use if your Client is trusted and is able to keep a secret. This means a server-side web application.

How to get a token with Authorisation Code Flow

All the involved Actors trust the Authorisation Server
1. User (Resource Owner) tells a Client (Application) to do something on his behalf
2. Client redirects the User to an Authorisation Server, adding some parameters: redirect_uri, response_type=code, scope, client_id
3. Authorisation Server asks the User if he wishes to grant the Client access to some resource on his behalf (delegation) with specific permissions (scope).
4. User accepts the delegation request, so the Authorisation Server now sends an instruction to the User-Agent (Browser) to redirect to the url of the Client. It also injects a code=xxxxx into this HTTP Redirect instruction.
5. Client, which has been activated by the User-Agent thanks to the HTTP Redirect, now talks directly to the Authorisation Server (bypassing the User-Agent). It sends its client_id, its client_secret and the code (that has been forwarded to it).
6. Authorisation Server returns to the Client (not the browser) a valid access_token and a refresh_token

This is so articulated that it’s also called the OAuth2 dance!
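To make the dance a little more concrete, here is a minimal sketch in Python of the three pieces of data the Client has to handle: the redirect towards the Authorisation Server, the code that comes back through the User-Agent, and the direct call used to exchange the code for tokens. The endpoint, the client registration values and the function names are all hypothetical. The grant_type=authorization_code field is what the OAuth2 spec expects in the exchange request; sending the client_secret in the form body is just one of the accepted ways to authenticate the Client.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint and client registration, for illustration only.
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"
CLIENT_ID = "my-client-1"
CLIENT_SECRET = "s3cr3t"
REDIRECT_URI = "https://client.example.com/callback"


def build_authorization_url(scope: str) -> str:
    """URL the Client redirects the User-Agent to, to start the delegation."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def extract_code(callback_url: str) -> str:
    """Read the code the Authorisation Server injected into the redirect."""
    query = parse_qs(urlparse(callback_url).query)
    return query["code"][0]


def build_token_request(code: str) -> dict:
    """Form fields the Client posts directly to the Authorisation Server."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    }


url = build_authorization_url("scope1 scope2")
code = extract_code(REDIRECT_URI + "?code=xxxxx")
token_request = build_token_request(code)
print(url)
print(token_request)
```

No network call is performed here on purpose: the sketch only shows which parameters travel through the browser and which ones travel on the direct connection.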

Let’s underline a couple of points:

At step 2 we specify, among the other params, a redirect_uri. This is used to implement the indirect communication we anticipated when we introduced the User-Agent as one of the actors. It’s a key piece of information if we want to allow the Authorisation Server to forward information to the Client without a direct network connection open between the two.
The scope mentioned at step 2 is the set of permissions the Client is asking for.
Remember that this is the flow you use when the Client is entirely secured. This becomes relevant at step 5, when the communication between the Client and the Authorisation Server avoids passing through the less secure User-Agent (that could sniff or tamper with the communication). This is also why it makes sense for the Client to add even more security by sending its client_secret, which is shared only between it and the Authorisation Server.
The refresh_token is used for subsequent automated calls the Client might need to perform to the Authorisation Server. When the current access_token expires and the Client needs to get a new one, sending a valid refresh_token avoids asking the User again to confirm the delegation.
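As a rough sketch of how a Client might use the refresh_token, the Python snippet below decides when the access_token should be renewed and builds the fields for the renewal request. The function names, the 30 second leeway and the way the expiry time is stored are my own assumptions; grant_type=refresh_token is the value the OAuth2 spec defines for this request.

```python
def needs_refresh(expires_at: float, now: float, leeway: float = 30.0) -> bool:
    """True when the access_token is expired, or will be within `leeway` seconds."""
    return now >= expires_at - leeway


def build_refresh_request(refresh_token: str, client_id: str, client_secret: str) -> dict:
    """Form fields the Client posts to the Authorisation Server, with no User involved."""
    return {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }


# Hypothetical values: a token expiring at t=100, checked at two moments.
still_fresh = needs_refresh(expires_at=100.0, now=50.0)
about_to_expire = needs_refresh(expires_at=100.0, now=80.0)
refresh_request = build_refresh_request("my-refresh-token", "my-client-1", "s3cr3t")
print(still_fresh, about_to_expire, refresh_request["grant_type"])
```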

OAuth2 Got a token, now what?

OAuth2 is a framework, remember. What does the framework tell me to do now?

Well, nothing. =P

It’s up to the Client developer.

She could (and often should):

check if the token is still valid
look up detailed information about who authorised this token
look up the permissions associated with that token
any other operation that makes sense before finally giving access to a resource

They are all valid, and pretty obvious points, right?
Does the developer have to figure out on her own the best set of operations to perform next?
She definitely can. Otherwise she can leverage another specification: OpenID Connect (OIDC). More on this later.
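Just to give an idea of the do-it-yourself approach, here is a minimal Python sketch of the checks listed above. It assumes the developer has already asked the Authorisation Server about the token and received a dictionary of metadata back; the shape of that dictionary (the active, exp and scope fields) and the function name are assumptions made for this example.

```python
def can_access(token_info: dict, required_scope: str, now: float) -> bool:
    """Decide whether a token allows an operation that needs `required_scope`."""
    # Is the token still considered valid by whoever issued it?
    if not token_info.get("active", False):
        return False
    # Has it expired? `exp` is assumed to be seconds since the Unix epoch.
    if token_info.get("exp", 0) <= now:
        return False
    # Was the needed permission part of the delegation?
    granted_scopes = token_info.get("scope", "").split()
    return required_scope in granted_scopes


# Hypothetical metadata for one token.
token_info = {"active": True, "exp": 2000, "scope": "scope1 scope2", "username": "paolo"}
print(can_access(token_info, "scope1", now=1000))
print(can_access(token_info, "scope3", now=1000))
print(can_access(token_info, "scope1", now=3000))
```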

OAuth2 - Implicit Grant Flow

It’s the flow designed for Client applications that can’t keep a secret. An obvious example is client side HTML applications. But any binary application whose code is exposed to the public can also be manipulated to extract its secrets.
Couldn’t we have re-used the Authorisation Code Flow?
Yes, but… What’s the point of step 5 if the secret is not a secure secret anymore? We don’t get any protection from that additional step!
So the Implicit Grant Flow is similar to the Authorisation Code Flow, but it doesn’t perform that useless step 5.
It aims to obtain access_tokens directly, without the intermediate step of first obtaining a code that is then exchanged, together with a secret, for an access_token.

It uses response_type=token to specify which flow to use while contacting the Authorisation Server.
Also, there is no refresh_token. This is because it’s assumed that user sessions will be short (due to the less secure environment) and that, anyhow, the user will still be around to re-confirm his will to delegate (avoiding that re-confirmation was the main use case that led to the definition of refresh_tokens).
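In this flow the access_token travels back to the Client inside the fragment (the part after the #) of the redirect URL. A real client side application would read it with JavaScript in the browser; the Python sketch below only illustrates the parsing, and the URLs, client_id and function names are hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_implicit_url(authorize_endpoint: str, client_id: str,
                       redirect_uri: str, scope: str) -> str:
    """Same redirect as the Authorisation Code Flow, but asking for a token."""
    params = {
        "response_type": "token",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


def parse_fragment(redirect_url: str) -> dict:
    """Extract the token fields from the fragment of the redirect URL."""
    fragment = urlparse(redirect_url).fragment
    return {name: values[0] for name, values in parse_qs(fragment).items()}


implicit_url = build_implicit_url(
    "https://auth.example.com/authorize", "my-spa", "https://spa.example.com/callback", "scope1"
)
received = parse_fragment(
    "https://spa.example.com/callback#access_token=abc123&token_type=Bearer&expires_in=3600"
)
print(implicit_url)
print(received)
```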

OAuth2 - Client Credential Grant Flow

What if we don’t have a Resource Owner or if he’s indistinct from the Client software itself (1:1 relationship) ?
Imagine a backend system that just wants to talk to another backend system. No Users involved.
The main characteristic of such an interaction is that it’s no longer interactive, since we no longer have any user that is asked to confirm his will to delegate something.
It also implicitly defines a more secure environment, where you don’t have to worry about active users getting to read secrets.

Its type is grant_type=client_credentials (there is no response_type here, since no redirect through a User-Agent is involved).

We are not detailing it here; just be aware that it exists and that, just like the previous flow, it’s a variation, a simplification actually, of the full OAuth dance, which you are encouraged to use if your scenario allows it.
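Even without going into details, a small sketch can show how little is left of the dance in this case: a single direct request where the backend authenticates itself. The Python snippet below builds that request using HTTP Basic authentication for the Client, which is one common option; the client name, secret and scope are made-up values, and the credentials are assumed to be plain ASCII.

```python
import base64


def build_client_credentials_request(client_id: str, client_secret: str, scope: str):
    """Headers and form fields for a backend asking for a token for itself."""
    # The Client authenticates with its own id and secret (HTTP Basic).
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = {"grant_type": "client_credentials", "scope": scope}
    return headers, body


headers, body = build_client_credentials_request("backend-a", "s3cr3t", "reports:read")
print(headers["Authorization"])
print(body)
```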

OAuth2 - Resource Owner Credentials Grant Flow (aka Password Flow)

Please pay attention here, because you are about to be confused.

This is the scenario:
The Resource Owner has an account on the Authorisation Server. The Resource Owner gives his account details to the Client. The Client uses these details to authenticate to the Authorisation Server…

If you have followed through the discussion you might be asking if I’m kidding you.
This is exactly the anti-pattern we tried to move away from at the beginning of our OAuth2 exploration!

How is it possible to find it listed here as a possible suggested flow?

The answer is quite reasonable actually: It’s a possible first stop for migration from a legacy system.
And it’s actually a little better than the shared password antipattern:
The password is shared, but that is just a means to start the OAuth Dance used to obtain tokens.

This allows OAuth2 to get its foot in the door, if we don’t have better alternatives.
It introduces the concept of access_tokens, and it can be used until the architecture is mature enough (or the environment changes) to allow a better and more secure Flow to obtain tokens.
Also, please notice that now tokens are the ad-hoc password that reaches the Protected Resource system, while in the fully shared password antipattern it was our own password that had to be forwarded.

So, far from ideal, but at least it’s justified by some criteria.
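The difference with the fully shared password antipattern might be easier to see in a sketch. In the Python snippet below, the password appears only in the request towards the Authorisation Server, while the request towards the Protected Resource carries just the token. Function names and values are hypothetical; grant_type=password is the value the OAuth2 spec defines for this flow.

```python
def build_password_grant_request(username: str, password: str, client_id: str) -> dict:
    """Form fields the Client posts to the Authorisation Server: the only
    place where the Resource Owner's password is used."""
    return {
        "grant_type": "password",
        "username": username,
        "password": password,
        "client_id": client_id,
    }


def build_resource_headers(access_token: str) -> dict:
    """Headers for calling the Protected Resource: only the token travels."""
    return {"Authorization": f"Bearer {access_token}"}


token_request = build_password_grant_request("paolo", "my-password", "my-client-1")
resource_headers = build_resource_headers("363tghjkiu6trfghjuytkyen")
print(token_request)
print(resource_headers)
```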

How to choose the best flow?

There are many decision flow diagrams on the internet. One of those that I like the most is this one:

OAuth2 Flows from https://auth0.com

It should help you remember the brief description I have given you here and choose the easiest flow based on your environment.
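If you prefer code to diagrams, the criteria described in this article can be roughly condensed into a few questions. The Python sketch below is based only on the descriptions given here, not on the diagram, and both the function and its parameter names are my own invention; take it as a memory aid rather than a rule.

```python
def suggest_flow(has_resource_owner: bool,
                 client_can_keep_secret: bool,
                 migrating_from_shared_password: bool = False) -> str:
    """Map the scenarios described in this article to a flow name."""
    if not has_resource_owner:
        # Backend talking to backend, no User to ask for a delegation.
        return "Client Credential Grant Flow"
    if migrating_from_shared_password:
        # Legacy system that still collects the User's password.
        return "Resource Owner Credentials Grant Flow"
    if client_can_keep_secret:
        # Server side web application.
        return "Authorisation Code Flow"
    # Client side application whose code can be inspected.
    return "Implicit Grant Flow"


print(suggest_flow(has_resource_owner=True, client_can_keep_secret=True))
print(suggest_flow(has_resource_owner=True, client_can_keep_secret=False))
print(suggest_flow(has_resource_owner=False, client_can_keep_secret=True))
```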

OAuth2 Back to tokens - JWT

So, we are able to get tokens now. We have multiple ways to get them. We have not been told explicitly what to do with them, but with some extra effort and a bunch of additional calls to the Authorisation Server we can arrange something and obtain useful information.

Could things be better?

For example, we have assumed so far that our tokens might look like this:

{
   "access_token": "363tghjkiu6trfghjuytkyen",
   "token_type": "Bearer"
}

Could we have more information in it, so as to save us some round-trips to the Authorisation Server?

Something like the following would be better:

{
  "active": true,
  "scope": "scope1 scope2 scope3",
  "client_id": "my-client-1",
  "username": "paolo",
  "iss": "http://keycloak:8080/",
  "exp": 1440538996,
  "roles" : ["admin", "people_manager"],
  "favourite_color": "maroon",
  ... : ...
}

We’d be able to access directly some information tied to the Resource Owner delegation.

Luckily someone else had the same idea, and they came up with JWT - JSON Web Tokens.
JWT is a standard that defines the structure of JSON based tokens representing a set of claims. Exactly what we were looking for!
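On the wire, a signed JWT usually travels in its compact form: three base64url encoded parts (header, claims, signature) joined by dots. The Python sketch below builds a sample token with a placeholder signature and reads its claims back without verifying anything. The helper names and the sample claims are assumptions for this example, and an unverified token must of course never be trusted.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """base64url without padding, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def decode_unverified(token: str):
    """Split a compact token and decode header and claims, ignoring the signature."""
    header_b64, claims_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(claims_b64))


# Sample token, assembled by hand; the signature is a placeholder, not a real one.
sample_header = {"alg": "RS256", "typ": "JWT"}
sample_claims = {"iss": "http://keycloak:8080/", "exp": 1440538996, "username": "paolo"}
sample_token = ".".join([
    b64url_encode(json.dumps(sample_header).encode("utf-8")),
    b64url_encode(json.dumps(sample_claims).encode("utf-8")),
    b64url_encode(b"not-a-real-signature"),
])

header, claims = decode_unverified(sample_token)
print(header)
print(claims)
```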

Actually the most important aspect that the JWT spec gives us is not the payload that we have exemplified above, but the capability to trust the whole token without involving an Authorisation Server!

How is that even possible? The idea is not a new one: asymmetric signing (public key cryptography), defined, in the context of JWT, by the JOSE specs.

Let me refresh this for you:

In asymmetric signing two keys are used to verify the validity of information.
These two keys are coupled, but one is secret, known only to the document creator, while the other is public.
The secret one is used to calculate a signature over a fingerprint of the document (a hash).
When the document is sent to its destination, the reader uses the public key, associated with the secret one, to verify that the document and the signature he has received are valid.
Digital signing algorithms tell us that the document is valid, according to the public key, only if it has been signed by the corresponding secret key.

The overall idea is: if our local verification passes, we can be sure that the message has been published by the owner of the secret key, so it’s implicitly trusted.
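To see the mechanism at work, here is a deliberately toy version of RSA signing written in plain Python. The key is built from two small, well known primes, there is no padding, and the fingerprint is simply truncated to fit the key: all of this is an assumption made to keep the example short, and none of it is suitable for real use. The point is only to show that the signature is produced with the secret exponent and checked with the public one.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. Far too small and
# missing padding: for illustration only, never for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public: modulus
E = 65537                           # public: exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # secret: exponent (needs Python 3.8+)


def fingerprint(document: bytes) -> int:
    """Hash of the document, reduced so that it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Only the owner of the secret exponent D can compute this."""
    return pow(fingerprint(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Anyone holding the public values (N, E) can run this check."""
    return pow(signature, E, N) == fingerprint(document)


document = b"paolo delegates scope1 to my-client-1"
signature = sign(document)
print(verify(document, signature))
print(verify(b"paolo delegates EVERYTHING to my-client-1", signature))
```

The second check fails because the tampered document produces a different fingerprint, which no longer matches what the signature encodes.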

And back to our tokens use case:

We receive a token. Can we trust this token? We verify the token locally, without the need to contact the issuer. If, and only if, the verification based on the trusted public key passes, we confirm that the token is valid. No questions asked. If the token is valid according to its digital signature AND it’s still alive according to its declared lifespan, we can take that information as true, and we don’t need to ask the Authorisation Server for confirmation!
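Here is a minimal Python sketch of that local check: verify the signature first, then the lifespan. Be aware of one important swap: since the Python standard library has no RSA support, this example signs with HMAC-SHA256 (the HS256 algorithm, based on a shared secret) instead of the asymmetric signature described above. The structure of the check is the same, but with HS256 the verifier holds the same key as the issuer. Function names, the key and the claims are hypothetical, and a real verifier should also inspect the token header.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue(claims: dict, key: bytes) -> str:
    """What the issuer does: encode header and claims, then sign them."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part).encode("utf-8")) for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify(token: str, key: bytes, now: float) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is alive, else None."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{claims_b64}".encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        # 1) Signature check, done locally with no call to the issuer.
        if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
            return None
        claims = json.loads(b64url_decode(claims_b64))
    except ValueError:
        return None  # malformed token
    # 2) Lifespan check: `exp` is seconds since the Unix epoch.
    if claims.get("exp", 0) <= now:
        return None
    return claims


key = b"hypothetical-shared-key"
token = issue({"username": "paolo", "exp": 2000}, key)
print(verify(token, key, now=1000))
print(verify(token, key, now=3000))
```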

As you can imagine, since we put all this trust in the token, it might be wise not to emit tokens with an excessively long lifespan:
someone might have changed his delegation preferences on the Authorisation Server, and that information might not have reached the Client, which still has a valid and signed token it can base its decisions on.
Better to keep things a little more in sync by emitting tokens with a shorter lifespan, so that any outdated preferences don’t risk being trusted for long periods.

OpenID Connect

I hope this section won’t disappoint you, but the article was already long and dense with information, so I’ll keep it short on purpose.

OAuth2 + JWT + JOSE ~= OpenID Connect

Once again: OAuth2 is a framework.
The OAuth2 framework is used in conjunction with the JWT specs, JOSE and other ideas we are not going to detail here, to create the OpenID Connect specification.

The idea you should take away is that, more often than not, you are probably interested in using and leveraging OpenID Connect, since it puts together the best of the approaches and ideas defined here.
You are, yes, leveraging OAuth2, but you are now within the much more defined bounds of OpenID Connect, which gives you richer tokens and support for Authentication, which was never covered by plain OAuth2.

Some of the online services let you choose between OAuth2 and OpenID Connect. Why is that?
Well, when they mention OpenID Connect, you know that you are using a standard. Something that will behave the same way, even if you switch implementation.
The OAuth2 option you are given is probably something very similar, potentially with some killer feature that you might be interested in, but custom built on top of the more generic OAuth2 framework.
So be cautious with your choice.

Conclusion

If you are interested in this topic, or if this article has only confused you more, I suggest you check out OAuth 2 in Action by Justin Richer and Antonio Sanso.
On the other hand, if you want to check your fresh knowledge and try to apply it to an open source Authorisation Server, I definitely recommend playing with Keycloak, which is capable of everything that we have described here and much more!