Software Development and GDPR

Developers everywhere are having various grades of freakout about the EU’s General Data Protection Regulation (GDPR).

The core of this freakout is ambiguity, and it’s a key insight into how developer brains work, especially when the job to be done is something that seems extraneous, dull, or difficult.

If anything, software development is the action of embodying a swirl of concepts and rules that support the creation of a targeted set of interactions with people or things. The process is about eliminating ambiguity from the field entirely.

With this GDPR business, we are seeing the legal profession slop over into development, and legal wrangles are all about ambiguity and interpretation. This is why no piece of legislation in its own right is enough. It serves as a line where we start, but only after that line has been laid down can we practically begin to understand exactly how it affects the real world of people in all that world’s weirdness, complexity and indeed unexpected evolution and change. There are no system tests here, folks.

To attempt to understand the whole situation better, I enrolled in an Advanced Diploma in Data Protection at King’s Inns, Ireland’s oldest law school. This is a course that looked at Data Protection in general, with a solid focus on GDPR, but taken very much from a legal perspective, although muggles like myself were also invited to join in.

Well, it was fun times, and the whole thing is a little complex. GDPR helps to harmonize the rules across the 28 (for now) countries in the EU, and by extension the wider EEA – everyone had different rules and it was super onerous to have to know everything. GDPR is a Regulation, which means it applies as written in each member state – there are no diffs permitted. Governments are subject to it: this means that it is illegal, for example, for the DoJ to share conviction data with the DMV for the purposes of the DMV checking up on uninsured cars, because, maybe, some dude reckons that all people with a conviction are bad peoples and ergo won’t insure their cars.

That’s the big picture – the entire reason for the presence of the regulation is to help level the playing field for individual humans who are going up against corporations and states.

To come back to the development aspect: the ambiguities persist. This piece of writing was triggered by @DazeEnd and @joec discussing whether it was ok to just mark a database record with deleted=yes. Their reasoning was that it’s considered ok to delete a file on a harddrive, even though the underlying harddrive still has the data on it, merely marked as released=yes (see note below). That is a valid question, from the point of view of a developer. My assertion is that from a practical perspective, this is the kind of niggle that we must dismiss – and the kind of niggle that occurs when there’s a job to be done that no-one really wants to do. The result of niggling at this level is eventual upgrade to the atomic version – “we are now not available in Europe” – because it looks so DIFFICULT because there are all these EDGE CASES.

You could think, as a developer, that the lawyers worry about this kind of fine-grained issue. They don’t. This is one of those situations where they say, well, here’s the risk, you have to make a decision, document it, and be ready to back that up in front of a judge should the soup hit the fan.

In this particular case it’s straightforward enough. Are you in control of the presence of data in your database? Yes. It’s up to you to delete it when requested. Are you in control of the data on your harddrive? Yes. It’s up to you to delete it when requested. Are you in control of the operating system implementation or database implementation of deletion? No. Could you get the data back if you wanted to? Yes – but that’s not part of your usual run of business, so why would you explicitly do that? What if some bad dude steals your harddrive and then rummages through it? Ok we are getting a little far-fetched here for most businesses that are not keeping special category data, but if this does happen, then you have failed in your security controls.
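To make the distinction concrete, here is a minimal sketch of the two approaches, using Python’s built-in sqlite3 module. The users table, its columns, and the soft_delete and erase function names are all made up for illustration; your schema will differ. The only point is that a deleted flag leaves the personal data sitting in the table, while removing the row is the part that is actually under your control.

```python
import sqlite3


def soft_delete(conn, user_id):
    # Only flags the row: the email address is still stored in the table.
    conn.execute("UPDATE users SET deleted = 1 WHERE id = ?", (user_id,))
    conn.commit()


def erase(conn, user_id):
    # Removes the row itself. What the database engine and the disk do
    # underneath is outside the application's control.
    conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
    conn.commit()


# Hypothetical schema, kept in memory so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, email TEXT, deleted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com")],
)
conn.commit()

soft_delete(conn, 1)  # user 1: flagged, data still present
erase(conn, 2)        # user 2: row gone

remaining = conn.execute(
    "SELECT id, email, deleted FROM users ORDER BY id"
).fetchall()
print(remaining)  # [(1, 'a@example.com', 1)]
```

User 1’s email address is still readable by anyone who can query the table, which is why the flag on its own does not look like deletion on request. User 2’s row is gone as far as the application is concerned, and that is the level at which you can reasonably document your decision.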

I guess my overall point here is that GDPR compliance is a continuum, not a tickbox. You want to be doing the best you can with it and document why you can go so far and not further. The companies that will be getting the big regulatory fines are the guys that are willy-nilly exporting special category data out of the EEA en masse without the knowledge of the people associated with that data. The rest of us just need to muddle along as best we can.

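Note – if you are using modern TRIM-enabled SSDs, the operating system tells the drive which blocks belonged to a deleted file, and the drive is then free to erase them. In practice this usually happens soon after deletion rather than waiting for the space to be overwritten, though the exact timing depends on the drive.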