Reflection in Go and modifying struct values

Recently I found myself wanting to write some code to load values into a Go object. The obvious approach is reflection. However, the Go documentation on reflection is somewhat impenetrable, and the accompanying article on The Laws of Reflection only has a single skimpy example involving structs right at the end.

After some trial and error I came up with some robust code to demonstrate examining a struct and altering its fields. I decided to write it up in detail here.

A simple example

Let’s start with the struct:

type Name string

type Person struct {
  FirstName Name
  LastName  Name
  Age       int
}

First of all we define a custom type, so that our fields aren’t all primitive types and our example is a bit more general. Then we assemble a simple struct.

Next, we instantiate an example struct of that type, and pass it to a function.

t := Person{"John", "Smith", 23}
reflectOn(t)

The signature for the function will be as follows:

func reflectOn(s interface{}) {
  // Code here
}

Why the function call for a simple example?

Well, in real code you aren’t going to be performing reflection in the same function where you create an object. By passing an interface{} argument, we bypass compile-time type checking entirely, so our example is forced to rely on reflection. In real code, of course, you’d ideally narrow the scope a bit with a more specific interface.

In addition, by putting the reflection code in a function we can call it twice:

reflectOn(t)
reflectOn(&t)

Now we can make sure our code deals with structs passed by reference as well as by value, and we can demonstrate the important difference that makes, concerning whether you can change the fields of the struct.

So, what is a function argument really?

Recall that unlike (say) Java, Go is all about interfaces, not classes. A given object (struct) can implement any number of interfaces, just by providing the right methods. Object oriented programming is done via composition, rather than a class hierarchy.

So when you define a function in Go which operates on objects, you specify the interface which the objects must implement in order to be acceptable to the function. Our function takes an argument of type interface{}. That’s the empty interface, one with no methods specified, so as per the spec absolutely anything implements it — even a primitive type such as an int.
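To see that literally anything satisfies the empty interface, here is a minimal sketch; the kindOf helper is my own hypothetical name, not from the original code:

```go
package main

import (
	"fmt"
	"reflect"
)

// kindOf reports the underlying Kind of any value
// passed through the empty interface.
func kindOf(s interface{}) string {
	return reflect.TypeOf(s).Kind().String()
}

func main() {
	fmt.Println(kindOf(42))          // int
	fmt.Println(kindOf("hello"))     // string
	fmt.Println(kindOf([]int{1, 2})) // slice
}
```

Even a plain int arrives wrapped in an interface value, and reflection can tell us what it really is.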

So our function accepts a value of any type. What it receives is an interface value, held in an interface variable.

If you’ve read about the implementation of the Go type system or at least tried to digest The Laws of Reflection, you’ll know that an interface value in Go is a tuple consisting of a type descriptor (“this is a string”) and the type-dependent data representing the value.

So, the first step for our reflection function is to extract those two things:

ptyp := reflect.TypeOf(s) // a reflect.Type
pval := reflect.ValueOf(s) // a reflect.Value

Next we want to look at the type of thing we were passed, to make sure it’s what we expect. However, the Type we just extracted is the specific type of the value — a Person in this case. What we really want to know is whether it’s a struct, before we try to go looking at its fields. So we look at the Kind of the type, which we obtain from the Type by calling Kind().

You might want to try printing out the value of ptyp.Kind(). If you try it with the two function calls:

reflectOn(t)
reflectOn(&t)

…you will quickly discover that in the second case, the interface type’s Kind is Ptr. So although in Go you can often ignore the distinction between a struct and a pointer to a struct, when it comes to reflection the difference is exposed.

So our function needs to know how to deal with pointers and get at the thing pointed to. The reflect package provides an Elem() method on Value which dereferences a pointer and returns the Value pointed at; Type has an Elem() method which does the same for the pointed-to type. So:

var typ reflect.Type
var val reflect.Value
if ptyp.Kind() == reflect.Ptr {
  fmt.Printf("Argument is a pointer, dereferencing.\n")
  typ = ptyp.Elem()
  val = pval.Elem()
} else {
  fmt.Printf("Argument is %s.%s, a %s.\n", ptyp.PkgPath(), ptyp.Name(),
    ptyp.Kind())
  typ = ptyp
  val = pval
}

At this point, our two new variables typ and val contain the Type and Value of the actual struct, whether we were given it as an actual value or via a pointer. Now we can make sure that it really is a struct:

if typ.Kind() != reflect.Struct {
  fmt.Printf("Not a struct.\n")
  return
}

If this seems like a lot of work, remember that in real code you would know whether you were going to call your function with a struct or a pointer to a struct and would just call Elem() or not (as appropriate) in the first line or two of code.

Next, let’s examine the key difference between passing a struct by value and passing it by reference:

if val.CanSet() {
  fmt.Printf("We can set values.\n")
} else {
  fmt.Printf("We cannot set values.\n")
}

If you try the code so far, you’ll discover that if your function call is reflectOn(t), CanSet() will report that you can’t set values; if it’s reflectOn(&t), you can. If you learned to program in Java, this probably makes no sense at all, but it goes back to the invention of function calls in programming languages. A brief digression is in order. If you’re a C or C++ programmer, you can skip to the next section.

What is the stack?

Back in the 1950s, the programming language Algol 60 was being designed. One of its design goals was to support recursion. This meant allowing an unrestricted number of function calls — functions calling functions calling functions. To do this, Dijkstra invented the stack. (See: A Brief History Of The Stack.)

Each time a function was called:

  1. The arguments would be pushed onto the stack, in specified order.
  2. The code for the function would be called.
  3. The function would pop the arguments off the stack.
  4. The function would perform its computations, and put its result on the stack.
  5. The function would end, returning to the calling code, which would pop the result from the stack.

In the 1960s, memory was scarce and computers were slow. A 1964 IBM 360 mainframe started out with 8KiB of memory and executed 34,500 instructions per second — so not even equivalent to a 1MHz clock. If you wanted to pass a string to a function, the idea of copying the entire string onto the stack and off again would have crippled performance. So instead, any argument whose data was larger than a few bytes would be replaced with a pointer to the argument.

The same methods were used for function calls in CPL, which was modeled on Algol. CPL gave way to the BCPL programming language, and its successor C.

Nowadays compilers use various tricks to speed up function calls. For example, if all the arguments will fit into processor registers, they get passed that way instead of via the stack. However, conceptually Go still uses the same stack-based argument passing as its programming language ancestors. One difference, however, is that Go will actually shove an entire string or other large data object onto the stack if you ask it to — conceptually, at least.

By value or by reference

When we call our function via reflectOn(t), Go pushes an entire copy of the struct t onto the stack. The function retrieves it as s. The function doesn’t have any way to know where the copy came from. Whatever it does with the copy, the original will remain unchanged.

When we call our function via reflectOn(&t), Go pushes a pointer to the struct onto the stack. The function retrieves the pointer. At that moment, it can access the original structure — so any changes it makes will be visible when the function returns and the original structure is examined.

So although our code makes sure that typ and val are the Type and Value of the struct, in one case they are the type and value of a copy of the struct, and any changes we try to make will be ignored — so Go warns us of this by returning false from val.CanSet(). Notice that whether the value is settable is a property of the value and how we obtained it, not a property of the type of the structure; the struct’s type is identical in both cases.

We’ll get back to this in a few more lines of code. First, let’s see how we look at the fields of the struct. Logically, the fields of the struct and their types are defined in the type definition of the struct, so we would expect to use the typ variable to access the individual fields. And so we do:

for i := 0; i < typ.NumField(); i++ {
  sfld := typ.Field(i)

At this point we have a value representing a field. If you’re used to how Java reflection works you might expect it to be some sort of field class of a particular type that you can use to access the data, but in Go there’s another step to go through.

In Go, calling the Field(int) method on a struct’s Type returns a special StructField object. To get the actual type of the field, we read the Type field of the StructField. Just as we examined the underlying type or ‘kind’ of our function argument, so we can do the same for the field:

tfld := sfld.Type // The Type of the StructField of the struct
kind := tfld.Kind() // The Kind of the Type of the StructField

OK, now how about the value? Java gives you a single Field object which you can interrogate for both type and value information. Go has two separate sets of objects to handle that. So just as we called .Field() on the struct’s Type to get at the field’s type (via an intermediate StructField), so we need to call .Field() on the struct’s Value to get the field’s value. This time, however, there’s no intermediate StructValue:

vfld := val.Field(i)
fmt.Printf("struct field %d: name %s type %s kind %s value %v\n", i,
  sfld.Name, tfld, kind, vfld)

Running the code at this stage will produce output like this:

struct field 0: name FirstName type main.Name kind string value John
struct field 1: name LastName type main.Name kind string value Smith
struct field 2: name Age type int kind int value 23

So, we’ve decoded our struct completely, down to field level. We’ve extracted both the specific types (including custom types we defined), and the underlying primitive types. We’ve even read out the data.

Writing fields

Now that we can read from the struct, let’s work out how to change it.

You might wonder whether setting a value is an operation you perform on the type, or an operation you perform on the value. In a dynamic language like Ruby, you’d expect to call a type-dependent method to set the value. But Go is statically typed, so you can’t change the type of a field at runtime, only its value. So to change a value you use a Set method on the value of the individual field, as returned by the Field() method of the struct’s Value. And if you try to tell the Value to take on a value of an incompatible type, Go will panic.

Also, you need to use the Set methods on the Value of the individual field you want to change — not the interim StructField. So let’s try it:

if kind == reflect.String && vfld.CanSet() {
  fmt.Printf("Overwriting field %s\n", sfld.Name)
  vfld.SetString("Anonymous")
}
} // closes the for loop over the fields

Notice that the field’s Value has its own CanSet() method, just like the overall struct’s Value does.

So now I can restate the part that confused the heck out of me: You can’t modify the value of a struct in Go using a Type or StructField. To perform a reflection operation in Go you need to go through two separate interrogation processes: first you start with the struct and retrieve all the type information you want and check it, then you start again at the struct and work down the value chain to the field value and change it.

You can interleave the operations, as I’ve done in this example code, but fundamentally you’re dealing with two different trees of information.

You can get the complete code on GitHub with added comments. If you run it, you’ll see quite clearly the behavior difference between calling with a struct, versus calling with a pointer to a struct:

First, passing the actual structure:

Argument is main.Person, a struct.
We cannot set values.
struct field 0: name FirstName type main.Name kind string value John
struct field 1: name LastName type main.Name kind string value Smith
struct field 2: name Age type int kind int value 23
After reflection:
John Smith, 23 years old

Now, passing a pointer to the structure:

Argument is a pointer, dereferencing.
We can set values.
struct field 0: name FirstName type main.Name kind string value John
Overwriting field FirstName
struct field 1: name LastName type main.Name kind string value Smith
Overwriting field LastName
struct field 2: name Age type int kind int value 23
After reflection:
Anonymous Anonymous, 23 years old

Hopefully that covers everything you need to know about reflecting on structs.

Bluemix J2EE basics with Eclipse and WAS Liberty Profile

IBM Bluemix is a cloud platform built around Cloud Foundry, OpenStack, and other popular open source projects. As well as using it to deploy Go, Node.js and PHP applications, you can use it to develop J2EE applications. To do so, you can use WebSphere Liberty, a clean ground-up implementation of J2EE in a single 60MB download. Don’t confuse it with the old WebSphere Application Server: this new beast can start up in seconds and run in 512MB of RAM.

I decided to see how easy it was to get up and running. Here are my notes:

Step 1: If you don’t already have it set up, download Eclipse and get it running. I used Eclipse IDE for Java EE Developers.

Step 2: If you don’t have them installed already, you’ll need to download and install the Bluemix CLI and Cloud Foundry CLI tools.

Step 3: Download the runtime for WebSphere Liberty Profile with Java EE 7 Web Profile and install it by unpacking the archive somewhere. I unpacked it to /usr/local/wlp, owned by my regular ID (because the directories need to be writeable).

Step 4: As per the getting started with WAS Liberty page, drag-drop the Install link onto Eclipse’s toolbar, and we’re off to see the wizards. You should be led through the process of setting up Bluemix as a Cloud server deployment in Eclipse.

Step 5: Log in to Bluemix via the web console, and create your app using the Liberty for Java runtime.

Step 6: Bluemix will offer you a starter code bundle. Download it and unzip it into Eclipse’s workspace directory.

Step 7: Use File > Import > Existing Projects Into Workspace to import the project into Eclipse. It’ll complain about missing info, but you can ignore that.

Step 8: Right-click the project in the Project Explorer, and choose Run As > Run On Server. At this point Eclipse lets you either select an existing server or manually define a new one.

If you choose Choose an existing server, you should see IBM Bluemix listed. Chances are the state is shown as Started, because Bluemix started your new application when you created it.

You might hit a Java version problem. First time through, I got an error saying “Project facet Java version 1.7 is not supported”. The solution was to remove the project from Eclipse, restart Eclipse, make sure Java 1.8 was my default runtime, and then re-import the project. There’s probably a way to do it by editing preferences, but after spending a while wandering around in a maze of Eclipse configuration settings I was unable to find it. Short answer: make sure you’re running the same JRE version as Bluemix supports, which at the time of writing is IBM JRE 1.8.

At this point, you can try editing the Java source code for SimpleServlet.java, ask Eclipse to run it on a server, choose Bluemix, and then watch as the Eclipse console builds and uploads the application and redeploys the changes.

Once you’ve done that a couple of times, you’ll notice that it takes a minute or so. To save your sanity, select Run On Server again and choose Manually define a new server. Click the Configure runtime environments link, and we’re off to see the wizard again. In the dialog that appears, pick “WebSphere Application Server Liberty”, and click the Next button.


In the next dialog, tell Eclipse where you unpacked the WAS Liberty runtime earlier.


Finish and OK your way back to the Run On Server dialog. You should now be able to select WebSphere Application Server Liberty as a server type, and choose the runtime environment you just defined.


Clicking Finish should make Eclipse fire up WAS Liberty, deploy your application to it, and launch an embedded web browser to demonstrate that it works.

Finally, you might want to set up a Maven “Run As” configuration for the project, with a list of Goals consisting of the word package. That will get Maven to build and package the entire application into a .war file in the target directory of the project directory. You should be able to use that WAR file for deployment.

All working? git init and put it all in a repository before you break anything.

XPages reliability checklist

When XPages works, it’s great. When it doesn’t work, it’s a pain. Partial refreshes suddenly stop working, user data is thrown away, and forms become unsubmittable.

The root problem is that JSF holds a complex tree of objects on the server, representing the state of all the components on the web page — along with a Domino document for your data, and all the scope variables. Each HTTP request to the back end is accompanied by a $$viewid value fetched from a hidden field on the Web form. The server uses this to fetch the JSF object tree, apply the changes, run the appropriate code on all the components, and then pass back the results.

The good side of this is that you can attach code to components which performs lookups, manipulates document data, and so on — without having to care about the distinction between client web browser and Domino server.

The bad side is that if the hidden $$viewid gets lost, or the server drops the component tree for some reason, the user ends up with a web page that doesn’t work, even though it worked a minute ago and apparently nothing has changed.

One reason why the server drops component trees is to save memory. JSF doesn’t just store the tree once — it resaves it for every “page” displayed to the user, keeping a history of the last N states of the tree. This chews up memory, and when the server runs out of memory, things go wrong.

Having spent a week or so chasing down dying JSF sessions and finally achieving reliability, I thought I’d put together a list of things to check.

Java object memory leaks

The first thing to do is to check that none of the Java code called by your XPages leaks memory. Recall that any Domino Java class is a wrapper around some native C code, and you need to call recycle() on every single Domino object when you’re done with it. Yes, even Session objects.

Domino 9.0.3 is supposed to introduce JVM 8, at which point it should be possible to use an AutoCloseable wrapper to handle this. There’s also the OpenNTF Domino API, but I don’t use that because it’s incompatible with CORBA/DIIOP, and I have a lot of Java code which runs outside the server and communicates via DIIOP.

This also applies to Domino Java classes accessed using server-side JavaScript. For example, if you load a NotesDatabase and NotesView to look up some data, you need to recycle both when you’re done.

Persistence settings

Under Application Configuration > Xsp Properties > Persistence you’ll find a set of options controlling how JSF handles the component tree.

For the first drop-down (Server page persistence) I favor “Keep only the current page in memory”. This one is rather misleadingly worded. You might think that it means “Keep only the current page, throw away all the others”, but it actually means “Keep the current page in RAM, and put the historical data on disk” — so you don’t actually lose any functionality.

The second drop-down selects page persistence mode. For this, I seem to get the best results when I choose “Only changes since the tree was constructed”. In this mode, JSF builds a complete page tree once when you load the form, but partial refreshes and other updates only result in the changed data being stored in the server session.

(This information is hidden away in the XPages Portable Command Guide.)

Unnecessary data passing

Once you’ve told XPages to only store changes rather than the entire tree each time, the next thing to do is limit the possible scope of changes as much as possible. That means going through your forms and making sure you use partial refresh and partial execution as much as possible.

I’m not sure whether the runtime is smart enough to know that recomputing a value to the same value as before means that it doesn’t need to be stored. Perhaps it is, but it’s a good idea for performance to limit refresh scope anyway.

Session persistence

For the server to keep a session active, it needs to know that the client is still there. The XPages Extension Library has a Keep Session Alive control, but it doesn’t seem to work very well. Instead, I use a periodic partial refresh.

Server memory

Finally, it goes without saying that your Domino server should have plenty of RAM and disk space. However, don’t be fooled into thinking that just because the server has lots of resources, it can’t be throwing away sessions for resource reasons: even a 64-bit server with 40GiB of free RAM will drop sessions, in my experience.