Earlier this year, as part of another project, I downloaded the South African education department’s list of schools. I also had a copy of the 2013 matric results for all public schools (at that stage in PDF format). It occurred to me that it might be worth putting the two together to create some sort of schools information tool.
Of course, like most things, that is easier said than done. The first major hurdle was getting the matric results out of hundreds of heavily designed PDF pages. Pretty much nothing was up to the task, although Tabula did the best of the lot. Eventually I gave up on extraction and went straight to the education department. It took a day or two, but I ended up with a copy of the 2013 results as a spreadsheet.
With two handy spreadsheets in hand it seemed a simple task to join them together (they both had school IDs which ought to match). Again, not so simple. There were lots of discrepancies between the two lists and they were in serious need of cleaning (OpenRefine to the rescue).
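For anyone attempting something similar, here is a rough Python sketch of that kind of join. It is not what I actually ran (the cleaning was done in OpenRefine); the field names `school_id`, `name` and `pass_rate` are made up for illustration, and the ID clean-up rule (trim whitespace, drop leading zeros) is only an assumption about the sort of mismatch you might hit. The useful part is that it returns the unmatched rows from both lists, so the discrepancies can be checked by hand.

```python
def normalise_id(raw):
    """Trim whitespace and leading zeros so ' 0100' and '100' compare equal.

    Assumption: this is the kind of formatting mismatch present in the IDs.
    """
    return str(raw).strip().lstrip("0")


def join_on_school_id(schools, results):
    """Join two lists of dicts on a (hypothetical) 'school_id' field.

    Returns (joined, unmatched_schools, unmatched_results).
    """
    # Index the results by cleaned ID for quick lookup.
    results_by_id = {normalise_id(r["school_id"]): r for r in results}

    joined = []
    unmatched_schools = []
    matched_ids = set()

    for school in schools:
        key = normalise_id(school["school_id"])
        result = results_by_id.get(key)
        if result is None:
            unmatched_schools.append(school)
        else:
            # Merge both records and keep the cleaned ID.
            joined.append({**school, **result, "school_id": key})
            matched_ids.add(key)

    # Results that never found a school in the master list.
    unmatched_results = [
        r for key, r in results_by_id.items() if key not in matched_ids
    ]
    return joined, unmatched_schools, unmatched_results
```

Feeding it rows read with `csv.DictReader` from each spreadsheet export would work, and the two unmatched lists give a to-do list for manual verification.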
Because I was doing this as a side project I spread the work over a couple of weeks, picking it up when I had some spare time. Verifying the joined tables was pretty time-consuming, as was getting the data into a format that I could insert into a MySQL database. But eventually it was done and I could start building a front end for it.
The current version is a relatively simple School Finder. It includes all of the school information I could get my hands on, for both public and private schools. For now it only includes matric results for public schools. I am now looking at how I can add the private schools’ results.
The final product is built using PHP, MySQL, jQuery, and Google Maps. The maps are static for now because I had some issues using the interactive version with a large number of data points.