We collect environmental data that is scattered across many independent government websites and open it to everyone, including the media, in an open data format (CSV) via web download and an API. The site has already supported more than 2,000 projects, including work by media outlets such as Caixin and Southern Weekly.
In China, the government publishes detailed environmental data, such as air quality readings, only in real time; detailed historical data is not made available. Many studies and reports, however, need detailed historical data in order to run more complex computations or uncover hidden patterns. This project meets that need. To date, it is the best-known open data website in China for environmental data. We have supported many media organizations in producing original reporting, including Caixin, Southern Weekly, and China Business News, as well as the China Data Journalism Competition, among others. We have also supported more than 2,000 R&D projects; please refer to http://data.epmap.org/projects for details.
We use Python crawlers to collect the data, Ruby on Rails to build the website, PostgreSQL to store the data, and a RESTful API to make it available programmatically.
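Because the official sources only expose real-time readings, a historical archive has to be built by saving each snapshot as it appears. The following is a minimal sketch of that idea, not our production crawler. The CSV columns, station names, and table layout are hypothetical, and SQLite from the Python standard library stands in for PostgreSQL so the example runs without a database server.

```python
import csv
import io
import sqlite3

# Hypothetical snapshot, as a crawler might receive it from a real-time feed.
# Column names, station names, and values are made up for illustration.
SAMPLE_SNAPSHOT = """station,observed_at,pm25
Station-A,2015-03-01T08:00,82
Station-B,2015-03-01T08:00,41
"""


def open_archive(path=":memory:"):
    """Open the archive (SQLite here as a stand-in for PostgreSQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "station TEXT NOT NULL, "
        "observed_at TEXT NOT NULL, "
        "pm25 REAL, "
        "PRIMARY KEY (station, observed_at))"
    )
    return conn


def count_readings(conn):
    """Return the number of archived hourly readings."""
    return conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]


def archive_snapshot(conn, csv_text):
    """Store one real-time snapshot and return how many rows were new.

    The (station, observed_at) primary key together with INSERT OR IGNORE
    means that polling the same hour twice does not create duplicates.
    """
    rows = [
        (r["station"], r["observed_at"], float(r["pm25"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    before = count_readings(conn)
    conn.executemany("INSERT OR IGNORE INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()
    return count_readings(conn) - before


if __name__ == "__main__":
    archive = open_archive()
    print(archive_snapshot(archive, SAMPLE_SNAPSHOT))  # rows added on first poll
    print(archive_snapshot(archive, SAMPLE_SNAPSHOT))  # rows added on repeat poll
```

Keying each reading by station and observation time lets the crawler poll more often than the source updates without inflating the archive.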
What was the hardest part of this project?
It is the only website in China that provides detailed (hourly) environmental data, and it has already supported more than 2,000 projects.
What can others learn from this project?
Others can use the environmental data to uncover hidden patterns, or combine it with other data sets, such as health data or communications data, for further analysis.
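As one illustration of combining the hourly data with another data set, the sketch below averages hourly readings per day and correlates them with a daily series such as clinic visits. The function names, the timestamp format, and the idea of a daily health count are assumptions made for this example; the toy numbers are invented and are not results from the project.

```python
from collections import defaultdict
from math import sqrt


def daily_means(hourly):
    """Average (timestamp, value) readings per calendar day.

    Timestamps are assumed to be ISO-like strings ("YYYY-MM-DDTHH:MM"),
    so the first ten characters identify the day.
    """
    buckets = defaultdict(list)
    for timestamp, value in hourly:
        buckets[timestamp[:10]].append(value)
    return {day: sum(vals) / len(vals) for day, vals in buckets.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlate(hourly, daily_counts):
    """Correlate daily mean readings with a daily count series.

    Only days present in both data sets are compared.
    """
    means = daily_means(hourly)
    days = sorted(set(means) & set(daily_counts))
    return pearson([means[d] for d in days], [daily_counts[d] for d in days])


if __name__ == "__main__":
    # Invented toy values, for illustration only.
    toy_hourly = [
        ("2015-03-01T08:00", 10.0),
        ("2015-03-01T09:00", 30.0),
        ("2015-03-02T08:00", 40.0),
        ("2015-03-03T08:00", 60.0),
    ]
    toy_visits = {"2015-03-01": 2, "2015-03-02": 4, "2015-03-03": 6}
    print(correlate(toy_hourly, toy_visits))
```

A correlation of this kind only suggests where to look further; it does not by itself show that one series causes the other.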