Designing for delivery in public services

This post is not about making design fit an organisation.

The public sector often has little choice about what it must deliver: much of the “what” is enshrined in law.

This is a stark quality of building products to meet the needs of citizens. The need for parallel delivery is obvious where new services are being delivered simultaneously (think Welfare Reform Act), and also where existing services are being improved.

We have spent time over the last year or so moving towards any one of our teams being able to work on “the next most important feature”. I call these “global priorities”.

However, this is not enough. We must carefully balance how products themselves are described and prioritised to allow some compromise to be made. Where teams are organised around “lines of business”, for example Working Age Benefits, Retirement and Health, this breakdown of products becomes a matter of duty above and beyond the tangible benefits to be had from building a strong affinity between teams and the products they build and run.

What began as a tightly-knit set of empowered people building a service to meet a specific set of needs has grown into the agent for the transformation of a whole government department.

In so doing, we are now finding ways to scale hard things like service design and prioritisation. Succeeding will help more people build and evolve brilliant services at an even greater pace, and that really will improve the lives of our users.

The key to this is understanding how to “scope” products that come together in a user’s journey through a service. Putting too much into this scope will create a bottleneck around those roles in the product team that aren’t easy to scale (product management and UX most notably). Too small and, while the product might iterate liberally with all the gusto a small team can throw at it, the overall experience risks becoming disjointed.

If you’re interested in joining the team and putting this into practice – keep your eyes on https://careers.dwp.gov.uk

As we expand, we’re recruiting across most roles in Leeds, Manchester and London, including into the Burbank team in Leeds.

Ethics on the ground 

I had the privilege to speak about “agile data” at an event last week. Alongside me in the lineup were the CEO of Steama.co, an internet of things energy company servicing the African market, and a senior technologist from a leading car manufacturer. 

Each talk provoked discussion and it was great to see people in the audience really getting their heads around the wider issues being discussed, particularly when it came to ethics. 

Two of the talks touched on data sharing and data collection (and retention) infrastructure and it was genuinely reassuring to see audience members – practitioners – questioning the ethical framework within which these projects sit from at least two perspectives: “is it ethical to do that?” and “is it ethical to not do that?”

It’s all very well having CEOs, university professors and politicians calling out sociological and ethical implications of technology (think Musk and Hawking on the potential dangers of AI development), but having it thought about and having opinions form at the working level is what will really make the difference. Bravo. 

Five real things about security in agile

Things we’ve learned along the way

With the benefit of hindsight, these are the things that, if I ruled the world, we would stay very true to from day 1.

Yes defence is hard, yes attacks will eventually succeed and yes the “threat landscape” is ever-changing, but we’re also developing services in a way that gives us the best chance of making this harder and harder for attackers too.

Note that the “what” of your security work should seriously consider the excellent guidance from the NCSC (formerly CESG). #shamelessplug.

1. The attack model is effective as a set of tests

…otherwise you’re going to overlook even the things you know.

Since security events frequently are in “the long tail”, these cannot be exhaustive. But once a credible attack is understood, it should be encoded as a test.

This way we can guard against regression for a specific attack, and where the tested part of the codebase changes, the developer will need to think about how to rewrite the test appropriately. The security team can then review the test cases as a smaller codebase.

This should be used to capture the “misuse cases” against each story to ensure that decisions that were made for security reasons are not lost as the service evolves. We’ve found that it is usually possible to articulate these quite clearly as “Given…when…then…” statements.
Given that a user’s machine has been compromised with man-in-the-browser malware
When they attempt to log in and additional POST fields are detected
Then their account should be labelled as potentially compromised
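As a rough illustration of how a misuse case like this might be encoded as a regression test, here is a minimal Python sketch. Everything named in it (EXPECTED_LOGIN_FIELDS, Account, handle_login and the "potentially_compromised" flag) is hypothetical and stands in for whatever your service actually does at login; the point is only that the “Given…when…then…” wording maps directly onto the test.

```python
from dataclasses import dataclass, field

# Hypothetical set of fields the real login form is expected to submit.
EXPECTED_LOGIN_FIELDS = frozenset({"username", "password", "csrf_token"})


@dataclass
class Account:
    username: str
    flags: set = field(default_factory=set)


def handle_login(account: Account, post_fields: dict) -> None:
    """Flag the account if the login POST carries fields the form never sends."""
    unexpected = set(post_fields) - EXPECTED_LOGIN_FIELDS
    if unexpected:
        account.flags.add("potentially_compromised")


def test_extra_post_fields_flag_account():
    # Given that a user's machine has been compromised with man-in-the-browser malware
    account = Account("alice")
    injected = {"username": "alice", "password": "x", "csrf_token": "t", "card_pin": "1234"}
    # When they attempt to log in and additional POST fields are detected
    handle_login(account, injected)
    # Then their account should be labelled as potentially compromised
    assert "potentially_compromised" in account.flags


def test_clean_login_is_not_flagged():
    # The control should not fire for a login that only sends the expected fields.
    account = Account("bob")
    handle_login(account, {"username": "bob", "password": "x", "csrf_token": "t"})
    assert account.flags == set()
```

If someone later changes the login form, the first test fails until the expected fields are consciously updated, which is exactly the moment the security decision should be revisited.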

Hire proper penetration testers to help you build the tests. I say hire, because what really matters is that the security team build familiarity with how the service works as much as what it does. “Working software over comprehensive documentation.”

2. Put security controls in the build

…to avoid missing the point.

More specifically, this is a way of ensuring that the environment developers work in is as similar to production as possible. This allows the attack model to be played on an individual build as effectively as it would be on staging or production. This has potential architectural implications, and indeed it may be expensive in high-assurance environments.

By way of an example, let’s take the implementation of layer 7/“next generation”/web application firewalls performing validation of an HTTP payload between microservices. There are many ways to implement this functionality, many of which take the form of (very good) appliances.

However, what we’re really after here is a validation routine that only just lets through appropriate traffic, and that focuses the attention of our security team. If we implement the validation functionality in code or configuration accessible to the developer, we have a better chance of succeeding in our first goal, and the security team has a better shot at spotting a change to the message format than if they had to read through the whole codebase.

This helps even if you choose to implement the real firewall functionality differently across environments – ultimately we’re talking here about the quality of the validation configuration.
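To make the idea of validation configuration that lives alongside the code a little more concrete, here is a minimal Python sketch of an allow-list check on a message between two microservices. The message format (CLAIM_MESSAGE_SCHEMA and its fields) and the validate_message helper are invented for illustration and are not taken from any real service or firewall product.

```python
# Hypothetical message format for a call between two microservices.
# Each field maps to (expected type, maximum length or None).
CLAIM_MESSAGE_SCHEMA = {
    "claim_id": (str, 36),
    "amount_pence": (int, None),
    "currency": (str, 3),
}


def validate_message(payload: dict, schema: dict = CLAIM_MESSAGE_SCHEMA) -> list:
    """Return a list of validation errors; an empty list means the payload passes."""
    errors = []
    # Reject anything not explicitly allowed, so new fields need a schema change.
    for name in sorted(set(payload) - set(schema)):
        errors.append(f"unexpected field: {name}")
    for name, (expected_type, max_len) in schema.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            errors.append(f"wrong type for {name}")
        elif max_len is not None and len(value) > max_len:
            errors.append(f"{name} too long")
    return errors
```

Because the schema is one small dictionary, a change to the message format shows up as a short, reviewable diff rather than something buried in the wider codebase.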

3. Don’t put security controls in business processes

…or the intention will be lost upon first contact. Seriously.

Or if you need to, be really sure you’re going to revisit them, remembering how the control works and why the control matters.

An example could be asking auditors in a large firm to avoid working on their own cases or those of their friends/family.

This is often done (and accepted) due to the scale of a service at a given time. It’s very intuitive (and – for a given point in time – correct) to say “hey, there are only 100 users, the impact or probability and therefore the risk of this going wrong are low”. However, as you scale it’s extremely difficult to stay on top of these, particularly where managers and users have the freedom to innovate how they use your service.

At the very least, be explicit about where these controls exist, even if only in a list somewhere, and revisit that list frequently in your prioritisation meetings.
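The list of process-based controls can be very simple. As a sketch only, here is one way it might look in Python, with a helper that pulls out the entries that have not been revisited recently. The ProcessControl fields, the 90-day default and the dates are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ProcessControl:
    name: str
    why_it_matters: str
    last_reviewed: date


def overdue_controls(controls, today, max_age=timedelta(days=90)):
    """Return controls not reviewed within max_age, oldest review first."""
    stale = [c for c in controls if today - c.last_reviewed > max_age]
    return sorted(stale, key=lambda c: c.last_reviewed)


# Hypothetical register entry based on the auditor example above.
REGISTER = [
    ProcessControl(
        name="Auditors do not work on their own or friends'/family's cases",
        why_it_matters="Prevents conflicts of interest that the service does not enforce",
        last_reviewed=date(2024, 1, 10),
    ),
]
```

Running overdue_controls(REGISTER, date.today()) before a prioritisation meeting gives you the controls whose “how” and “why” are most at risk of being forgotten.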

4. Focus on getting good at monitoring

…so you might have a chance at spotting it happening and dealing with it.

Collecting and storing logs is easy. Getting value from all that data in the form of detection of security events is hard. The system evolves, attack methods are varied and validation of events takes time.

Whatever your approach to detection, practice. Remember “the attack model is effective as a set of tests”? Well, run them. Make sure you’re plugged into a safe environment in which to run them. If the team has got it right, this environment should be close enough to the real thing to make this practice meaningful.

Practice on production too – test new attack hypotheses, investigate the small things sometimes and – most importantly – get really close to the people doing dev and ops because they’re going to be manning the guns with you when things get real.
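A practice run can be as small as replaying a simulated attack into your detection logic and checking that it fires. The sketch below does this in Python for a deliberately simple, made-up rule (too many failed logins from one source). The event shape, the threshold and the function names are all assumptions, and real detection will be far messier.

```python
from collections import Counter

# Hypothetical rule: alert when one source has too many failed logins.
FAILED_LOGIN_THRESHOLD = 5


def detect_bruteforce(events, threshold=FAILED_LOGIN_THRESHOLD):
    """Return the sorted source addresses with at least `threshold` failed logins."""
    failures = Counter(
        e["source"] for e in events if e["type"] == "login" and not e["success"]
    )
    return sorted(src for src, count in failures.items() if count >= threshold)


def replay_bruteforce(source, attempts):
    """Generate the log events a simulated attack would leave behind."""
    return [{"type": "login", "source": source, "success": False} for _ in range(attempts)]


def practice_run():
    """Mix simulated attack traffic with normal traffic and check the rule fires."""
    normal = [{"type": "login", "source": "10.0.0.7", "success": True}]
    attack = replay_bruteforce("203.0.113.9", attempts=6)
    return detect_bruteforce(normal + attack)
```

If practice_run() comes back empty, you have learned something about your detection before an attacker taught it to you.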

5. Build trust in security through empathy

…or it will be sidelined, ineffective and even put success at risk.

Too many people have been burnt by bad security. Too many times has security been considered too late, or has the chief security architect held up a red card in the final throes of governance.

The remedy to this is simple: as a security team, fly in the face of traditional assurance practices by showing that you have skin in the game and by being transparent.

Because security matters so much in the services we build, it must be understood by the whole team. If a threat cannot be explained, a counter to it cannot be delivered. If a risk is overcooked, the psychological impact of it, and of other risks, will lessen with time.

I have learnt to take a very harsh default position with anyone with “security” in their name – you have about 3 interactions to demonstrate that you’re there to actually explain security and help deliver before you’re mentally tarred with the “enterprise” brush of doom.