Perils of automatic invisible software updates

A couple of days ago I noticed my Google Fiber uplink was only 100Mbps instead of 1Gbps. While I was debugging over the phone with a tech, it seemed as though any time I had something connected to port 4 on the network box, the speed dropped. I moved cables around until the speed was back to normal, and they scheduled a tech to come swap out the box.

Today the tech arrived and was unable to reproduce the problem. Even with port 4 connected to the switch — the highest traffic network segment — everything was normal.

Finally, he checked the network maintenance logs. It turned out that I just happened to run a speed test during the exact time period when Google was pushing out a firmware update to my network box. It probably dropped the link to 100Mbps while it was flashing the firmware and rebooting. The fiber UI is very barebones and doesn’t show that information at all, so I had no way to know that.

Making details invisible to users is great, until the moment when things don’t work as expected.

OpenID Connect introduction

OpenID Connect is an authentication standard built on top of OAuth2. From my point of view it has the following key features:

  1. It’s a lot simpler than anything involving SAML. Validating SAML requires a full implementation of XML Signature, which requires an implementation of XML Canonicalization, which requires a full XPath implementation. I’m not anti-XML in general, but I don’t think authenticating a user should require parsing, traversing and rearranging a DOM tree multiple times.

  2. It’s more secure than OAuth2, as we’ll see below.

  3. It can be implemented in situations where your web application can only reach the authentication provider via HTTPS — you don’t need to be able to make LDAP or Active Directory connections.

  4. It’s an open standard.

The downsides of OpenID Connect?

  1. It’s not very popular. Most Internet authentication providers seem to have rolled their own systems based on OAuth2.

  2. It’s kind of a pain in the ass to implement.

  3. The documentation is long and somewhat unapproachable.

This is a summary of the key information I wish someone had given me before I tried to make sense of OpenID Connect.

Endpoints

An OpenID Connect authentication provider has a set of four key endpoints:

  1. An authentication endpoint. You direct the user’s browser to this via an HTTP 302 redirect in order to start the authentication process. The user is bounced back to your web application after logging in, with parameters added to that HTTP request. The parameters may be a code, an ID token, an access token, or some combination of the three.

  2. A token endpoint. This is a REST API your application can use to obtain an ID token or access token, given a code.

  3. A user info endpoint. This is a REST API your application can use to obtain information about a user, given an access token. Typically it’s used to get information that doesn’t fit inside the ID token, such as the user’s avatar image. It also may not be supported, as it’s totally optional.

  4. An introspection endpoint. This is a REST API you can pass an access token to. It will return information about the token, such as whether it’s valid, when it will expire, and so on.

There are other optional endpoints, but those are the most important.
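
If it helps to see something concrete, here’s one way you might record a provider’s endpoints in Go. The struct and URLs are hypothetical placeholders, not any real provider’s:

// Hypothetical provider configuration. You'd get the real URLs from
// your authentication provider's documentation.
type Provider struct {
  AuthURL          string // authentication endpoint
  TokenURL         string // token endpoint
  UserInfoURL      string // user info endpoint (optional)
  IntrospectionURL string // introspection endpoint
}

var provider = Provider{
  AuthURL:          "https://idp.example.com/auth",
  TokenURL:         "https://idp.example.com/token",
  UserInfoURL:      "https://idp.example.com/userinfo",
  IntrospectionURL: "https://idp.example.com/introspect",
}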

Key data

As mentioned in the above descriptions, OpenID Connect involves passing around three key pieces of information:

  • An id_token is what you typically want: key information about the logged-in user, such as their ID and name. (There’s a sample payload after this list.)

  • The access_token (or just token in some places) is a token you can present to the userinfo endpoint or the introspection endpoint. It allows you to get more detailed information about the user, or to find out whether the granted authorization is still valid and when it will expire.

  • A code is a one-time code you can use, accompanied by a client ID and secret you’ve pre-arranged with the authentication provider, to connect to the token endpoint and get tokens.
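
To make the id_token concrete, here’s roughly what its decoded payload looks like. The values below are made up, but iss (issuer), sub (subject, the user’s ID), aud (audience, your client ID), exp (expiry) and iat (issued-at) are standard claims:

{
  "iss": "https://idp.example.com",
  "sub": "248289761001",
  "aud": "my-client-id",
  "exp": 1466703600,
  "iat": 1466700000,
  "name": "John Smith"
}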

Given the above, there are several ways to do OpenID Connect.

Implicit flow

With Implicit Flow you bounce the user to the authentication endpoint, and when they return you’re passed an ID token (and optionally, an access token so you can look up information that doesn’t fit in the ID token).

The advantage of implicit flow is that it’s really easy to implement. The disadvantage is that it’s not very secure.

For example, browser malware such as a malicious extension can sniff the tokens from the response. Because the ID token is in a standard format, the malware can extract the user’s account information from it. The stolen access token can be used with the authentication provider’s APIs as well.

There are some ways to mitigate the risk somewhat, but a much better option is…

Authorization flow

With Authorization Flow you bounce the user to the authentication endpoint; but when they return, all your app gets is an opaque code. You then need to make an authenticated REST call to the token endpoint, using an ID and password (secret) you obtained by registering with the authentication provider.

Because the tokens are not obtained via the browser, you can avoid being open to token stealing via browser malware. In addition, the application’s ID and secret are never given to the browser, and without those the browser malware can’t obtain tokens from the token endpoint.

The obvious initial downside is that your web application has to make a connection via TLS to the authentication provider.

Hybrid flow

Hybrid flow is when you make a call which requests a code as well as tokens. It has all the security disadvantages of implicit flow, and I can’t see why you’d want a code if you’re going to be given tokens anyway.

Authorization flow in detail

It’s pretty clear that authorization flow is the one to choose for security reasons. That’s probably why the authentication providers I need to use at work don’t support implicit flow. So, let me go through the authorization flow in detail, from the point of view of a web application. The OpenID Connect Core document has examples of HTTP requests and responses, so I’ll skip those and just try to briefly and clearly summarize the process.

The first step is to bounce the user via an HTTP 302 redirect to the authentication endpoint with an OAuth2 request. The request has a scope of openid, a response_type of code, and a redirect_uri pointing back to your web application.
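
In Go, that first redirect might be built something like this. It’s a minimal sketch: the endpoint URL, client ID and redirect URI are placeholder values you’d get when registering your application with the provider.

import (
  "net/http"
  "net/url"
)

// startLogin bounces the browser to a hypothetical authentication
// endpoint. The state value should be random and tied to the user's
// session, to protect against cross-site request forgery.
func startLogin(w http.ResponseWriter, r *http.Request) {
  q := url.Values{}
  q.Set("response_type", "code")
  q.Set("scope", "openid")
  q.Set("client_id", "my-client-id")
  q.Set("redirect_uri", "https://myapp.example.com/callback")
  q.Set("state", "random-per-session-value")
  http.Redirect(w, r, "https://idp.example.com/auth?"+q.Encode(), http.StatusFound)
}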

The ID provider then asks the user to log in. This may involve pretty much anything you can imagine. In my case, there’s a company login and password, and then I get asked for a TOTP one-time code from my phone.

Once the user has logged in, the authentication provider bounces them back to the URL that was specified in the request. The code value is added to the query part of the URL.

Next, your web application takes the code value and uses it to make a call to the token endpoint. The connection is made via HTTPS (TLS), and authenticated using an ID and secret (password) you were given when you registered your application with the authentication provider.

(The ID and secret may be passed as parameters in the request body, or using regular HTTP Basic authentication. The latter is preferred, according to the standards.)
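
Here’s a minimal sketch of that call in Go, using the same placeholder endpoint and credentials as before, and assuming net/http, net/url and strings are imported:

// Exchange the one-time code for tokens, authenticating with
// HTTP Basic auth. code is the value from the redirect back to us.
form := url.Values{}
form.Set("grant_type", "authorization_code")
form.Set("code", code)
form.Set("redirect_uri", "https://myapp.example.com/callback")
req, err := http.NewRequest("POST", "https://idp.example.com/token",
  strings.NewReader(form.Encode()))
// Check error and deal with it here
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
req.SetBasicAuth("my-client-id", "my-client-secret")
resp, err := http.DefaultClient.Do(req)
// Check the error and the status code, then decode the JSON body, which
// contains the id_token and (if you requested one) the access_token.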

If all goes well and you weren’t given a fake code, you get back a hunk of JSON containing the tokens you requested. The one you’re probably most interested in is the id_token, and this is where things get a bit tricky.

The id_token is encoded and signed. You have to decode it, check the signature using the public key of the ID provider, check the token hasn’t expired, and only then can you rely on the information in it.

The standard to follow for the decoding and signing is JSON Web Tokens (JWT). If you get this far and paste the base64-encoded id_token string that you got from the token endpoint into the jwt.io web site, you should see the info you want appear as the payload.

So your code needs to load a PEM public key, decode the JSON Web Token, and check its contents using the public key. If the signature is invalid, someone’s trying to impersonate the authentication provider, and you don’t trust the info in the ID token.

If the ID token validation succeeded and you also requested an access token, you might then use the access token to make a call to the userinfo endpoint, to get additional user information. (Home address, photo, you name it; it’s extensible, like LDAP.)
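
That call is just an HTTP GET with the access token presented as a Bearer token. A sketch, with the endpoint URL again a placeholder:

// Ask the userinfo endpoint for additional claims about the user.
req, err := http.NewRequest("GET", "https://idp.example.com/userinfo", nil)
// Check error and deal with it here
req.Header.Set("Authorization", "Bearer "+accessToken)
resp, err := http.DefaultClient.Do(req)
// Check the error and the status code, then decode the JSON body of claims.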

Libraries you can use

I’d love to say “Go here and use this library”, but at least for Go I haven’t found anything that will do the whole job.

There are many OAuth2 libraries, but none of them seem to support RSA signed JWT tokens.

There’s an openid2go project which looks as if it might work, but it relies on OpenID Connect Discovery being supported. Unfortunately, the authentication providers I need to use don’t have discovery enabled. Furthermore, it doesn’t look to me as if openid2go supports authorization flow.

What about at least handling the JWT part? Well, I took a look at the three Go packages linked from the JWT web site.

SermoDigital’s jose package describes itself as “A comprehensive set of JWT, JWS and JWE libraries”. Unfortunately the documentation is lacking, and I can’t work out how to perform signature validation against a public key, assuming that’s actually implemented.

Next, jose2go. That one’s nice and simple. Given the id_token text you got from the token endpoint as a string in the variable idtoken, the process looks like this:

// Assumes imports: "io/ioutil", jose "github.com/dvsekhvalnov/jose2go"
// and Rsa "github.com/dvsekhvalnov/jose2go/keys/rsa"
pubkeydata, err := ioutil.ReadFile("pubkey.crt")
// Check error and deal with it here
pubkey, err := Rsa.ReadPublic(pubkeydata)
// Check error and deal with it here
payload, headers, err := jose.Decode(idtoken, pubkey)
// Check error and deal with it here

The ID token info you want is then in the payload variable, and the headers can be examined to make sure the expected signature algorithm was used, because you don’t want to accept unsigned tokens. I checked that a tiny change to the payload correctly caused a signature error at the Decode stage.

Finally, there’s jwt-go. This one works too, but the decoding and verifying process is slightly more involved because you need to supply a callback function to look up the appropriate key given the signing method. You get a Claims map with the values you want, and a Valid field indicating whether the validation succeeded. Again, a tiny mutation to the data was successfully detected.
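
In outline, the jwt-go version looks something like this. Again it’s a sketch, assuming jwt is the github.com/dgrijalva/jwt-go package, fmt is imported, and pubkey is the *rsa.PublicKey loaded earlier:

// Verify the ID token. The callback checks the signing method and
// returns the public key to validate the signature against.
token, err := jwt.Parse(idtoken, func(t *jwt.Token) (interface{}, error) {
  if _, ok := t.Method.(*jwt.SigningMethodRSA); !ok {
    return nil, fmt.Errorf("unexpected signing method %v", t.Header["alg"])
  }
  return pubkey, nil
})
if err != nil || !token.Valid {
  // Signature or claim validation failed; don't trust the token.
}
// token.Claims now holds the claims map described above.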

Reflection in Go and modifying struct values

Recently I found myself wanting to write some code to load values into a Go object. The obvious approach is reflection. However, the Go documentation on reflection is somewhat impenetrable, and the accompanying article on The Laws of Reflection only has a single skimpy example involving structs right at the end.

After some trial and error I came up with some robust code to demonstrate examining a struct and altering its fields. I decided to write it up in detail here.

A simple example

Let’s start with the struct:

type Name string

type Person struct {
  FirstName Name
  LastName  Name
  Age       int
}

First of all we define a custom type, so that our fields aren’t all primitive types and our example is a bit more general. Then we assemble a simple struct.

Next, we instantiate an example struct of that type, and pass it to a function.

t := Person{"John", "Smith", 23}
reflectOn(t)

The signature for the function will be as follows:

func reflectOn(s interface{}) {
  // Code here
}

Why the function call for a simple example?

Well, in real code you aren’t going to be performing reflection in the same function where you create an object. By passing an interface{} argument, we manage to completely bypass all type safety, meaning our example will be forced to rely on reflection. In real code, of course, you’d ideally narrow the scope a bit with a more specific interface.

In addition, by putting the reflection code in a function we can call it twice:

reflectOn(t)
reflectOn(&t)

Now we can make sure our code deals with structs passed by reference as well as by value, and we can demonstrate the important difference that makes, concerning whether you can change the fields of the struct.

So, what is a function argument really?

Recall that unlike (say) Java, Go is all about interfaces, not classes. A given object (struct) can implement any number of interfaces, just by providing the right methods. Object oriented programming is done via composition, rather than a class hierarchy.

So when you define a function in Go which operates on objects, you specify the interface which the objects must implement in order to be acceptable to the function. Our function takes an argument of type interface{}. That’s the empty interface, one with no methods specified, so as per the spec absolutely anything implements it — even a primitive type such as an int.

So our function accepts objects with any interface type. What it receives is an interface value in an interface variable.

If you’ve read about the implementation of the Go type system or at least tried to digest The Laws of Reflection, you’ll know that an interface value in Go is a tuple consisting of a type descriptor (“this is a string”) and the type-dependent data representing the value.

So, the first step for our reflection function is to extract those two things:

ptyp := reflect.TypeOf(s) // a reflect.Type
pval := reflect.ValueOf(s) // a reflect.Value

Next we want to look at the type of thing we were passed, to make sure it’s what we expect. However, the Type we just extracted is the specific type of the value — a Person in this case. What we really want to know is whether it’s a struct, before we try to go looking at its fields. So we look at the Kind of the type, which we obtain from the Type by calling Kind().

You might want to try printing out the value of ptyp.Kind(). If you try it with the two function calls:

reflectOn(t)
reflectOn(&t)

…you will quickly discover that in the second case, the interface type’s Kind is Ptr. So although in Go you can often ignore the distinction between a struct and a pointer to a struct, when it comes to reflection the difference is exposed.

So our function needs to know how to deal with pointers and get at the thing pointed to. The reflect package provides a method Elem() which operates on a Value and dereferences to get the Value pointed at. A similar method does the same thing for the Type. So:

var typ reflect.Type
var val reflect.Value
if ptyp.Kind() == reflect.Ptr {
  fmt.Printf("Argument is a pointer, dereferencing.\n")
  typ = ptyp.Elem()
  val = pval.Elem()
} else {
  fmt.Printf("Argument is %s.%s, a %s.\n", ptyp.PkgPath(), ptyp.Name(),
    ptyp.Kind())
  typ = ptyp
  val = pval
}

At this point, our two new variables typ and val contain the Type and Value of the actual struct, whether we were given it as an actual value or via a pointer. Now we can make sure that it really is a struct:

if typ.Kind() != reflect.Struct {
  fmt.Printf("Not a struct.\n")
  return
}

If this seems like a lot of work, remember that in real code you would know whether you were going to call your function with a struct or a pointer to a struct and would just call Elem() or not (as appropriate) in the first line or two of code.

Next, let’s examine the key difference between passing a struct by value and passing it by reference:

if val.CanSet() {
  fmt.Printf("We can set values.\n")
} else {
  fmt.Printf("We cannot set values.\n")
}

If you try the code so far, you’ll discover that if your function call is reflectOn(t) then CanSet() will report that you can’t set values. If it’s reflectOn(&t), you can set values. If you learned programming by learning Java, this probably makes no sense to you at all, but it goes back to the invention of function calls in programming languages. A brief digression is in order. If you’re a C or C++ programmer, you can skip to the next section.

What is the stack?

Back in the 1950s, the programming language Algol 60 was being designed. One of its design goals was to support recursion. This meant allowing an unrestricted number of function calls — functions calling functions calling functions. To do this, Dijkstra invented the stack. (See: A Brief History Of The Stack.)

Each time a function was called:

  1. The arguments would be pushed onto the stack, in specified order.
  2. The code for the function would be called.
  3. The function would pop the arguments off of the stack.
  4. The function would perform its computations, and put its result on the stack.
  5. The function would end, returning to the calling code, which would pop the result from the stack.

In the 1960s, memory was scarce and computers were slow. A 1964 IBM 360 mainframe started out with 8KiB of memory and executed 34,500 instructions per second — so not even equivalent to a 1MHz clock. If you wanted to pass a string to a function, the idea of copying the entire string onto the stack and off again would have crippled performance. So instead, any argument whose data was larger than a few bytes would be replaced with a pointer to the argument.

The same methods were used for function calls in CPL, which was modeled on Algol. CPL gave way to the BCPL programming language, and its successors B and C.

Nowadays compilers use various tricks to speed up function calls. For example, if all the arguments will fit into processor registers, they get passed that way instead of via the stack. However, conceptually Go still uses the same stack-based argument passing as its programming language ancestors. One difference, however, is that Go will actually shove an entire string or other large data object onto the stack if you ask it to — conceptually, at least.

By value or by reference

When we call our function via reflectOn(t), Go pushes an entire copy of the struct t onto the stack. The function retrieves it as s. The function doesn’t have any way to know where the copy came from. Whatever it does with the copy, the original will remain unchanged.

When we call our function via reflectOn(&t), Go pushes a pointer to the struct onto the stack. The function retrieves the pointer. At that moment, it can access the original structure — so any changes it makes will be visible when the function returns and the original structure is examined.

So although our code makes sure that typ and val are the Type and Value of the struct, in one case they are the type and value of a copy of the struct, and any changes we try to make will be ignored — so Go warns us of this by returning false from val.CanSet(). Notice that whether the value is settable is a property of the value and how we obtained it, not a property of the type of the structure; the struct’s type is identical in both cases.

We’ll get back to this in a few more lines of code. First, let’s see how we look at the fields of the struct. Logically, the fields of the struct and their types are defined in the type definition of the struct, so we would expect to use the typ variable to access the individual fields. And so we do:

for i := 0; i < typ.NumField(); i++ {
  sfld := typ.Field(i)

At this point we have a value representing a field. If you’re used to how Java reflection works you might expect it to be some sort of field class of a particular type that you can use to access the data, but in Go there’s another step to go through.

In Go, the .Field(int) method, when called on a struct’s Type, always returns a special StructField object. To get the actual type of the field, we read the Type field of the StructField. Just as we examined the underlying type or ‘kind’ of our function argument, so we can do the same for the field:

tfld := sfld.Type // The Type of the StructField of the struct
kind := tfld.Kind() // The Kind of the Type of the StructField

OK, now how about the value? Java gives you a single Field object which you can interrogate for both type and value information. Go has two separate sets of objects to handle that. So just as we called .Field() on the struct’s Type to get at the field’s type (via an intermediate StructField), so we need to call .Field() on the struct’s Value to get the field’s value. This time, however, there’s no intermediate StructValue:

vfld := val.Field(i)
fmt.Printf("struct field %d: name %s type %s kind %s value %v\n", i,
  sfld.Name, tfld, kind, vfld)

Running the code at this stage will produce output like this:

struct field 0: name FirstName type main.Name kind string value John
struct field 1: name LastName type main.Name kind string value Smith
struct field 2: name Age type int kind int value 23

So, we’ve decoded our struct completely, down to field level. We’ve extracted both the specific types (including custom types we defined), and the underlying primitive types. We’ve even read out the data.

Writing fields

Now that we can read from the struct, let’s work out how to change it.

You might wonder whether setting a value is an operation you perform on the type, or an operation you perform on the value. In a dynamic language like Ruby, you’d expect to call a type-dependent method to set the value. But Go is statically typed, so you can’t change the type of a field at runtime — only its value. So to change a value you use a Set method on the value of the individual field, as returned by the Field() method of the struct’s Value. And if you try to tell the Value to take on a value of a different incompatible type, Go will panic.

Also, you need to use the Set methods on the Value of the individual field you want to change — not the interim StructField. So let’s try it:

if kind == reflect.String && vfld.CanSet() {
  fmt.Printf("Overwriting field %s\n", sfld.Name)
  vfld.SetString("Anonymous")
}

Notice that the field’s Value has its own CanSet() method, just like the overall struct’s Value does.

So now I can restate the part that confused the heck out of me: You can’t modify the value of a struct in Go using a Type or StructField. To perform a reflection operation in Go you need to go through two separate interrogation processes: first you start with the struct and retrieve all the type information you want and check it, then you start again at the struct and work down the value chain to the field value and change it.

You can interleave the operations, as I’ve done in this example code, but fundamentally you’re dealing with two different trees of information.

You can get the complete code on GitHub with added comments. If you run it, you’ll see quite clearly the behavior difference between calling with a struct, versus calling with a pointer to a struct:

First, passing the actual structure:

Argument is main.Person, a struct.
We cannot set values.
struct field 0: name FirstName type main.Name kind string value John
struct field 1: name LastName type main.Name kind string value Smith
struct field 2: name Age type int kind int value 23
After reflection:
John Smith, 23 years old

Now, passing a pointer to the structure:

Argument is a pointer, dereferencing.
We can set values.
struct field 0: name FirstName type main.Name kind string value John
Overwriting field FirstName
struct field 1: name LastName type main.Name kind string value Smith
Overwriting field LastName
struct field 2: name Age type int kind int value 23
After reflection:
Anonymous Anonymous, 23 years old

Hopefully that covers everything you need to know about reflecting on structs.