Does Hong Kong need a third runway? Big data transparency could settle debate
This week's relaunch of a government statistics website is a step in the right direction, but tech insiders tell Elaine Yau the city still has a long way to go in making big data available.
As the buzz over big data grows ever louder, techies are eagerly harnessing its power to capture and process vast streams of detailed information from different aspects of life to address various needs - including those you never knew you had.
Spoken News is among 70 apps created using data that the government began releasing in 2011. Gathering information on availability of parking, nearby petrol stations and road conditions, its greatest use has been in helping drivers figure out how they can steer clear of traffic jams.
The app has been gaining traction, with 60,000 users signing up since it was released last year, says Will Kwok Yu-ho, an IT lecturer at Hong Kong Institute of Vocational Education (Sha Tin) who developed Spoken News with colleague Roy Lam Wai-lun.
"Using GPS signals [from mobile phones or vehicles] it lets the driver view images captured by Transport Department CCTVs. Updated every two minutes, the [video feed] gives a view of roads within tens of metres of his location," Kwok says. "So he can decide whether to turn into another road to avoid the congestion ahead."
Recognising the potential of big-data analytics, governments around the world have supported initiatives to realise its benefits by making statistics and data generated by different departments available to the public.
Hong Kong took a tiny step in this direction three years ago with the launch of data.one.gov.hk, the government's information portal. The site provided just two sets of raw data - video feeds of road conditions from the Transport Department and the length of vehicle queues waiting to get into the three cross-harbour tunnels.
This big-data initiative has been given a boost with an overhaul of the site, which relaunched this week as data.gov.hk. Name change aside, the portal is not only much bigger (3,000 sets of data under 18 categories covering all policy areas), it is also handier, with expanded filtering and search functions.
More data sets will be uploaded to the portal in phases, says Joey Lam Kam-ping, the deputy government chief information officer.
The previous version was a trial, during which various departments had to provide their data to the Office of the Government Chief Information Officer for upload, she says. Now they can release information directly to the portal themselves.
The office is also reaching out to non-government groups and the private sector to provide a more exhaustive database, including information such as the location of disabled-friendly facilities, or real-time information on vacant bays in privately run car parks.
But while welcoming improved access to official statistics as a shot in the arm for transparency in government, data scientists and app developers say Hong Kong lags behind advanced countries in making use of big data for research and developing apps.
A major flaw with the revamped portal is that it only provides real-time information, says Cyrus Wong Chun-yin, a research coordinator in multimedia and internet technology at the Hong Kong Institute of Vocational Education (Lee Wai Lee).
"You can't get data that's a minute old," he says. "Having only real-time data is useless. You cannot conduct any analysis without past information to show trends."
Because the portal did not present any accumulated data, Wong says his department had to devote considerable resources to capturing the information streaming in real time on a platform so that students could make use of it to develop apps in a project sponsored by Amazon. Fortunately, the sponsorship enabled them to store the gathered data on servers in Singapore.
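The workaround Wong describes, capturing a real-time feed so that a history accumulates, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the institute's actual system: the table layout, the helper names `open_archive`, `archive_snapshot` and `history`, and the idea of one numeric reading per source are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

def open_archive(path=":memory:"):
    """Create (or open) a SQLite archive for timestamped feed snapshots."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "captured_at TEXT NOT NULL, source TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn

def archive_snapshot(conn, source, value, captured_at=None):
    """Store one real-time reading with its capture time so it is not lost."""
    if captured_at is None:
        captured_at = datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?)",
        (captured_at.isoformat(), source, float(value)),
    )
    conn.commit()

def history(conn, source):
    """Return all archived readings for one source, oldest first."""
    rows = conn.execute(
        "SELECT captured_at, value FROM snapshots "
        "WHERE source = ? ORDER BY captured_at",
        (source,),
    )
    return rows.fetchall()
```

Calling `archive_snapshot` each time the feed refreshes turns a stream that offers only the latest value into a series that can later be queried for trends.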
In the West, governments are responsible for storing data and making it accessible, Wong adds.
"Traffic forecasting sites in the US pull together weather statistics such as flood and rain patterns, and combine them with real-time traffic information. By analysing past patterns, you can figure out how heavy rainfall will slow down traffic. Our platform has stored data for only one year. It's not enough to do analysis," he says.
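The kind of analysis Wong has in mind, relating past rainfall to traffic speed, might look something like the sketch below. The readings are made up for illustration, and the hourly keys and function names are assumptions; real forecasting sites use far richer models than a single correlation.

```python
from math import sqrt

def join_by_hour(rain_mm, speed_kmh):
    """Pair rainfall and traffic-speed readings that share the same hour key."""
    hours = sorted(set(rain_mm) & set(speed_kmh))
    return [(rain_mm[h], speed_kmh[h]) for h in hours]

def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two paired readings")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical archived readings, keyed by hour of day.
rain_mm = {"08:00": 0.0, "09:00": 5.0, "10:00": 20.0, "11:00": 35.0}
speed_kmh = {"08:00": 60.0, "09:00": 55.0, "10:00": 40.0,
             "11:00": 25.0, "12:00": 58.0}

pairs = join_by_hour(rain_mm, speed_kmh)  # 12:00 is dropped: no rain reading
r = pearson(pairs)  # strongly negative here: more rain, slower traffic
```

The point of the sketch is that the join only works on archived readings: with nothing but the current value from each feed, there is nothing to correlate.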
Data scientist Mart van de Ven finds that the portal also lacks a lot of the key data made available in other countries. A co-founder of Open Data Hong Kong, a group promoting the release of information from government and statutory bodies, Van de Ven says the US government releases crime-related statistics as incidents occur, which allows researchers to visualise and understand crime distribution in the cities.
"But in Hong Kong, there's no data from Hospital Authority about waiting times at emergency rooms, for example."
Wong concurs: Hong Kong lacks analysis of big data from official sources, which can help improve policymaking, he says.
"What's interesting about using big data is that you can combine the raw data from different government departments and lay down statistical models to [examine] all kinds of things."
For example, crunching weather data with statistics from the Hospital Authority can have useful applications:
"You can see how weather affects the number of people going to hospitals and develop a flu index like the US does. Then the government can estimate the amount of vaccines they should have available," Wong says.
"As app developers, what we need most is raw data, but the government just produces data summaries. There should not be any concern over privacy issues as you can remove personal information such as names from the raw data about who goes to hospital at what time for what illnesses.
"Release of raw data helps make a government become more accountable and transparent. With such information, the public will not think the government is carrying out policy in an arbitrary way. For example, the government keeps insisting that Hong Kong needs a third runway but we don't know how they do the calculations."
A lack of consistent formats for presenting data from government sources also makes it more difficult to develop useful analyses or applications, Wong says.
"The data released by various departments are all in different formats like XML and RSS. We need to spend lots of time [making them consistent] in order to develop an app."
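To give a sense of the clean-up work involved, the sketch below converts two differently shaped feeds into one common record layout. Both sample feeds, their element names and the `source`/`title`/`detail` layout are invented for illustration; they are not the formats the government portal actually publishes.

```python
import xml.etree.ElementTree as ET

# Hypothetical department feed in its own XML layout.
CUSTOM_XML = """<carparks>
  <carpark id="A1"><name>Harbour Car Park</name><vacancy>12</vacancy></carpark>
  <carpark id="B2"><name>Hill Road Car Park</name><vacancy>0</vacancy></carpark>
</carparks>"""

# Hypothetical RSS 2.0 feed from another department.
RSS_FEED = """<rss version="2.0"><channel>
  <title>Traffic notices</title>
  <item><title>Lane closure</title><description>Slow lane closed</description></item>
</channel></rss>"""

def from_custom_xml(text):
    """Map each <carpark> element onto the common record layout."""
    root = ET.fromstring(text)
    return [
        {
            "source": "carparks",
            "title": cp.findtext("name", default=""),
            "detail": "vacancy=" + cp.findtext("vacancy", default=""),
        }
        for cp in root.iter("carpark")
    ]

def from_rss(text):
    """Map each RSS <item> onto the same common record layout."""
    root = ET.fromstring(text)
    return [
        {
            "source": "rss",
            "title": item.findtext("title", default=""),
            "detail": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]

# One list of uniform records, whatever the original format was.
records = from_custom_xml(CUSTOM_XML) + from_rss(RSS_FEED)
```

Every additional format needs its own small adapter like these, which is the overhead Wong is describing.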
For all the shortcomings of the portal, Wong's students at the Institute of Vocational Education have managed to develop several apps using available government data.
Compiling data about missing people, wanted criminals and rewards for information leading to their capture, MissingHK offers a glimpse of who the police are looking for.
"This is mostly for fun. You can see which people are wanted by police, including their photos, and details of crimes committed," Wong says.
Another app gives a list of restaurants that are currently licensed, which would interest safety-conscious diners.
"While OpenRice tells you about the location of a restaurant and how customers rate it, you would not know if it is a licensed establishment. The Food and Environmental Hygiene Department issues an updated list of licensed eateries every day. The number of restaurants on OpenRice is far greater than on the government list. So you know many upstairs cafes operate without a licence as it is difficult to pass the fire services requirements."
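The comparison described here amounts to checking one list of names against another. A minimal sketch, assuming both lists are available as plain restaurant names (the sample names and the helper functions are hypothetical):

```python
def normalise(name):
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())

def possibly_unlicensed(listed, licensed):
    """Return listed names with no match in the licensed list, in original order."""
    licensed_keys = {normalise(n) for n in licensed}
    return [n for n in listed if normalise(n) not in licensed_keys]

# Hypothetical sample data.
listed = ["Lucky Cafe", "Upstairs  Noodles", "Harbour Grill"]
licensed = ["lucky cafe", "HARBOUR GRILL"]
flagged = possibly_unlicensed(listed, licensed)
```

A missing match is only a prompt for a closer look: a restaurant may trade under a different name from the one on its licence, so name matching alone cannot establish that it is unlicensed.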
Wong Ka-lok, who is studying for a higher diploma in cloud and data centre administration, hopes the government can release raw data about exam results so he can create an app to help secondary students select which university or major to apply for.
"When I was in secondary school, I found that education officials only release summarised data such as average student scores for various subjects. But when you choose a university, you need to know how well you need to do in public exams to enrol in a particular discipline. Knowing median grades scored by students accepted into different disciplines last year isn't enough to help me with selecting a major. If the government releases data about which subjects each entrant took [for their public exams] and the results for each subject, you can develop an app using algorithms to calculate your chances of entering different disciplines in the various universities."
At the Office of the Government Chief Information Officer, however, Lam insists that Hong Kong is a very transparent society and data from official sources is ready for release to the public.
"It's just that some departments don't know they have such data."
But while the office pledges to encourage each department to make more information accessible, Lam says: "Some data is only for internal use and not suitable for public release."
Previously, the government summarised a lot of data in graphs and other formats as a service for residents; however, officials recognise that residents increasingly prefer to access raw data to make their own analyses and calculations, she says.
Still, Lam defends restricting the official data on its portal to real-time information instead of also providing statistics from past years. The data provided is intended to enable app development, not for research, she says.
"There's no need for the user to know the data from three months ago. For research purposes, there are Legco papers and government reports on weather changes in the past. Storing such a large amount of information will push up costs a lot."
But to data scientists such as Van de Ven, the cost concerns are nonsensical.
"Data storage is extremely cheap these days," he says. "The government should not be deciding what applications the data sets are put to. They should open up their archives and let the public decide what to do [with the information]."