Guarding against abuse of personal data in the public domain
Allan Chiang says Hong Kong law protects us against the misuse of personal data already in the public domain, a provision that those seeking to reuse such data would do well to understand
Many people believe personal data collected from the public domain - such as the companies, land and vehicles registers, the government gazette, and even the internet - is open to unrestricted use. This view is incorrect.
Personal data, whether publicly available or not, is protected under the Personal Data (Privacy) Ordinance. Imagine the consequences if the opposite were true. Data users could get around the law by deliberately placing the data in the public domain, and the improper use of personal data that was leaked to the public domain would be legitimised.
Technology has exacerbated the risks of a loss of privacy. Advances in the aggregation, matching and further processing of personal data in the public domain mean such data mining is now conducted with phenomenal ease and efficiency. For those who know how, it is easy to profile an individual and generate new uses of the data beyond the purposes for which it was initially collected.
Admittedly, such profiling could generate economic and societal benefits. But at the same time, it poses grave privacy risks.
One example of these risks is the use of personal data to market goods and services. In a report last year, the US retail giant Target was found to have analysed the purchasing habits of its customers so closely that it was able to guess reliably whether a female customer was pregnant and by how many months. In one case, the father of a teenage girl found out that she was three months pregnant after his suspicions were raised by the increased number of pregnancy-related adverts from Target arriving in the mail. That Target had "data-mined" its way into a customer's womb is clearly intrusive.
It is conceivable that many marketers are using innovative analytics to enhance marketing effectiveness based on data supplied by the customer and data in the public domain. The problem is not so much related to the nature and source of the data but, rather, to the way the data is combined, further processed and used.
Another example is the compilation of bankruptcy and litigation records of individuals by data brokers, based on the judiciary's daily cause lists and cause books as well as the bankruptcy order notices in the government gazette. This is the subject of our recent investigation into a smartphone application which enabled subscribers to search such records by name and view the combined data in one go. The data subjects concerned could be harmed unknowingly if the data is used, for example, for checking their employability or creditworthiness.
First, as different people can share the same name or have similar names, it is problematic to ascribe the data to a target individual according to his name. Second, a person involved in litigation could be perfectly innocent but the database did not as a rule include the court's decision in his favour.
Third, bankruptcy is normally discharged after four to eight years, while the Rehabilitation of Offenders Ordinance prevents unauthorised disclosure of a previous minor conviction, provided the offender has not been reconvicted for three years.
Therefore, the indefinite retention and use of the bankruptcy and litigation data would unduly stigmatise an individual and bar him or her from leading a life free from encumbrances.
A further example is the unfettered access allowed to the companies, land and vehicles registers, as well as other public information sources, which puts at risk the sensitive data therein, such as Hong Kong identity card numbers, full residential addresses and signatures. If malicious people were to exploit the data, there would be a risk of financial loss, identity theft and threats to personal safety (through stalking and surveillance).
For this reason, we recently secured the co-operation of a website operator to cease operating an index whereby names of individuals and their identity card numbers found in the public domain were listed together to enable a search by either name or number. Such aggregation and processing of sensitive personal data were clearly inappropriate.
This website operation must be distinguished from that of a search engine, which acts purely as an intermediary in providing content data. As a search engine performs no value-added operations on the personal data it processes, the question of aggravating privacy risks does not arise.
A use-limitation principle in the ordinance provides that personal data should be used only for the purposes for which it was collected or a directly related purpose, unless exempted for activities such as law enforcement, professional due diligence, and publishing or broadcasting of the data as news and in the public interest.
The application of this principle depends on the original purpose of collecting and making public the data in question. Public registers are normally set up by statute.
Ideally, the purpose of a public register should be stated as specifically as practicable in the enabling legislation. If not, it could be implied. Very often, the purpose and limitations of use of information in the public domain have been spelled out.
Having ascertained the original purpose of making the personal data publicly available, the question of whether the reuse of such data is for the same purpose or a directly related purpose can be assessed on a case-by-case basis.
We need to explore the specific context in which the data was collected and the reasonable expectations of the data subjects as to the further use made of the data based on that context.
The test here is whether a reasonable person in the data subject's situation would find the reuse of the data unexpected, inappropriate or otherwise objectionable.
Allan Chiang is Hong Kong's privacy commissioner for personal data