Posts Tagged ‘democratisation of data’

Open Government = Hacking?

Tuesday, October 13th, 2009

Another day, another great news article on the rise of the “Democratisation of Data”. One theme that seems to be emerging, however, is that opening up this data is more about snooping, prying and highlighting problems/discrepancies than about improving citizen engagement and services.

Although the overall sentiment in this article is positive, I am bemused by the use of the term “hacker”. It’s perhaps just too broad a term – on one hand it means simply “an enthusiastic computer hobbyist”, on the other, “a person who breaks into computers”. I actually have a problem with both of these definitions in this context.

The opening up of government data should be about empowering normal citizens to make informed choices about the services they need and use and, more importantly, about the role they can play in their local/national community. Why, then, do we feel the need to paint this exercise as something only the cyber elite can participate in?

It worries me deeply, as I think it scares people off from engaging, prevents them from thinking about what could be achieved and simply reinforces the “technology is bad/complicated” message we see all too often in the media.

We need to turn this around. Just for a change, why don’t we lead with the quote that Chris Taggart from Openly Local ends the article with: “It’s about engaging the community”?

Differential Privacy

Friday, October 9th, 2009

Earlier this week I blogged about the growing evidence of governments opening up their public data at both a national and local level. While this in itself represents a great leap forward, it brings with it a new set of challenges that we will need to address. One in particular stands out: how the very real challenges we’re going to face around privacy will evolve in a Web/Gov 2.0 world.

Earlier this month I was chatting to Stuart Aston (one of our security advisors – you know the type, smarter than your average bear and very switched on to the evolution of the security principles we will face in an increasingly connected world) and he introduced me to the concept of “Differential Privacy”. He left me with a few white papers and a smile and, a few hours later, with my head pounding and eyes bleeding (trust me, you want to try and read this stuff), I finally got my head around the concept and what it’s going to mean to us as citizens.

Differential privacy is, essentially, a response to a troubling ability: making very specific conclusions (with incredible accuracy) about the identity of an individual when provided with two disparate sets of anonymised data on a similar topic.

The example given uses Netflix’s recent competition to improve their recommendation system as the backdrop…

Netflix published an anonymised data set covering around 500,000 subscribers in order to help developers come up with a solution to improve their recommendation system. Some bright sparks took this data and a similar export from IMDb and, by applying some fairly hairy maths, were able to identify specific individuals with a shocking 96% accuracy rate.
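
To make the idea a little more concrete, here is a toy sketch of this kind of linking in Python. It is not the researchers' actual method (theirs is far more sophisticated and statistical); it simply assumes that two ratings "match" when the film and score are the same and the dates fall within a couple of weeks of each other. All the data, the names `similarity` and `link`, and the thresholds are made up for illustration.

```python
from datetime import date

# Hypothetical "anonymised" ratings: real names replaced by opaque IDs.
# Each rating is (film title, stars, date rated).
anonymised = {
    101: [("Film A", 5, date(2005, 3, 1)), ("Film B", 2, date(2005, 3, 9)),
          ("Film C", 4, date(2005, 4, 2))],
    102: [("Film A", 5, date(2005, 3, 2)), ("Film D", 3, date(2005, 5, 1)),
          ("Film E", 1, date(2005, 5, 20))],
    103: [("Film B", 4, date(2005, 1, 5)), ("Film F", 5, date(2005, 2, 1))],
}

# Hypothetical public reviews posted under people's own names.
public = {
    "alice": [("Film A", 5, date(2005, 3, 3)), ("Film B", 2, date(2005, 3, 10))],
    "bob": [("Film A", 5, date(2005, 3, 1))],
}


def similarity(anon_ratings, public_ratings, max_days=14):
    """Count public ratings that have a counterpart in the anonymised record:
    same film, same score, and dates no more than max_days apart."""
    matches = 0
    for title, stars, when in public_ratings:
        for a_title, a_stars, a_when in anon_ratings:
            close_in_time = abs((when - a_when).days) <= max_days
            if title == a_title and stars == a_stars and close_in_time:
                matches += 1
                break  # count each public rating at most once
    return matches


def link(anonymised, public, min_matches=2):
    """Map each public name to an anonymised ID, but only when one record
    clearly stands out: enough matches, and strictly more than any other."""
    links = {}
    for name, ratings in public.items():
        scored = sorted(
            ((similarity(anon, ratings), anon_id)
             for anon_id, anon in anonymised.items()),
            reverse=True,
        )
        best_score, best_id = scored[0]
        runner_up = scored[1][0] if len(scored) > 1 else 0
        if best_score >= min_matches and best_score > runner_up:
            links[name] = best_id
    return links


print(link(anonymised, public))
```

In this made-up data, "alice" has two public ratings that line up with a single anonymised record, so she gets linked to it, while "bob" has only one rating, which fits two records equally well, so he stays unlinked. The worrying part is how few overlapping ratings it takes before one record stands out.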

This is mind-blowing, not just because of the maths involved, but because of what it means in a world of growing public data: the old bastions of privacy that we have relied upon thus far may no longer be enough.

Governments and organisations are going to need to take this seriously, as it will present some difficult challenges around liability and the duty of care to keep their citizens’/customers’ identity and data private.

In particular, think about the duty of care element. As an organisation, you have a legal requirement to look after the privacy of the data you hold on an individual or organisation – with this kind of re-identification possible, how far does this duty of care extend? If you keep your data anonymised but others can compromise that privacy (albeit with hairy maths and more public data), who is actually liable or legally responsible for the breach?

There are some tough answers to be found here and undoubtedly some more legislation will be required. In the meantime, though, it’s a concept we need to understand better so we can build appropriate responses that don’t restrict the overall movement towards making public data more readily accessible. We cannot afford to let this (and other similar issues) stop the democratisation of data, but we do need to go into this with our eyes open.