
    data journalism new media tools visualisation

    A mapping toolbox for journalists: 10+ tools worth checking out

    Posted By Alastair Otter

    This is part 2 of an occasional series on useful tools for data journalists. Part 1: Want to be a data journalist? Learn these important tools

    Maps are one of the most popular ways to visualise data and are an easy way to add context to geographically-based datasets in stories. They can also be beautiful and offer a different view of our world.  

    So making appealing and informative maps to support stories is a great skill to learn.

    Over the past few years I’ve found myself making an increasing number of maps to illustrate stories. Partly that’s because I’ve always had a love of maps and partly that’s because so much of the data I’ve worked with lends itself to being visualised on a map. Over this time I’ve also experimented with dozens of mapping tools and libraries and the list below is a shortlist based on the tools I find myself using most often.

    Some of these tools are easy to use. Add some data, choose your preferred settings and you create your map. Some of them (mostly towards the end of the list) require some programming knowledge but if you’re willing to invest the time you can create some great custom data visualisations with them.

    1 – My Maps
    https://www.google.com/maps/d/
    Ease of use: Easy
    This is a great place to start if you’re new to mapping. If you have a dataset that includes columns for addresses or GPS co-ordinates then you’re set. In My Maps create a new map, import your dataset and select the column you want to use as the point marker location and you’re done. You can also add points to the map, or draw polygons to indicate areas of interest or even add direction information to maps. It’s easy to use and versatile.

    2 – BatchGeo
    https://batchgeo.com/
    Ease of use: Easy
    BatchGeo does a pretty simple job but it does it well. Again, if you have a dataset with columns indicating address or GPS co-ordinates then you can paste this into BatchGeo and it will automatically look up these positions and add them to a map. It has fewer features than My Maps but if you’re just after a map with multiple, labelled points of interest then BatchGeo is worth taking a look at.

    3 – MapJam
    https://mapjam.com
    Ease of use: Easy
    MapJam is another easy-to-use mapping tool, and its maps have a distinctive, nicely styled look. Adding information and annotations to MapJam maps is straightforward. MapJam produces either flat image maps or interactive, embeddable maps.

    A Fusion Table map of the provincial crime rates in South Africa

    4 – Fusion Tables
    http://drive.google.com
    Ease of use: Easy/Medium
    Fusion Tables is part of the Google Drive suite of tools. At first it may not seem obvious what this tool does but it is pretty powerful once you get used to it. Fusion Tables imports most common file formats such as CSV, Excel and KML. Fusion Tables is excellent at geocoding your data, so long as you’ve got a decent location column. One of its most useful features is the ability to merge datasets. Mapping with FT is relatively simple and there are options for creating your own color buckets to illustrate your data. If you want to learn more about Fusion Tables take a look at my Fusion Tables and KML files primer.
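
    To make the idea of merging datasets concrete, here is a minimal Python sketch of the same kind of join done outside Fusion Tables. It is not how Fusion Tables works internally. The column names, province labels and figures are hypothetical placeholders, and the `merge_on` and `add_rate` helpers are invented for this illustration.

```python
import csv
import io

# Hypothetical sample data: labels and figures are placeholders, not real statistics.
CRIMES_CSV = """province,reported_crimes
Province A,1200
Province B,450
"""

POPULATION_CSV = """province,population
Province A,60000
Province B,15000
Province C,30000
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on(left, right, key):
    """Join two lists of rows on a shared column, keeping only rows present in both."""
    lookup = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = lookup.get(row[key])
        if match is not None:
            merged.append({**row, **match})
    return merged


def add_rate(rows):
    """Add a per-1,000-people rate so areas of different sizes can be compared."""
    for row in rows:
        crimes = int(row["reported_crimes"])
        population = int(row["population"])
        row["crimes_per_1000"] = round(1000 * crimes / population, 1)
    return rows


merged = add_rate(merge_on(read_rows(CRIMES_CSV), read_rows(POPULATION_CSV), "province"))
for row in merged:
    print(row["province"], row["crimes_per_1000"])
```

    The merged rows carry both the original count and a rate, which is usually the fairer figure to shade a map by.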

    5 – CartoDB
    http://cartodb.com
    Ease of use: Easy/Medium
    CartoDB is a lot like Fusion Tables but with loads more styling options. CartoDB can use different basemap styles and can handle multiple layers of information. Creating a basic, attractive map in CartoDB is pretty easy. It imports most common formats and can geocode country-level data. CartoDB has an extensive set of tools for making really detailed maps which can take a little while to master. The ability to export datasets in multiple formats makes this a go-to tool for me. I tend to do some initial planning work in CartoDB and then export the data as geojson files for use in a mapping library like Leaflet.js.
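
    For readers curious what a geojson export actually contains, the sketch below builds a small GeoJSON file from a CSV of points using only Python’s standard library. It is an illustration of the file format, not CartoDB’s export code. The column names `latitude` and `longitude`, the sample points and the `csv_to_geojson` helper are assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical sample data with illustrative coordinates.
SAMPLE_CSV = """name,latitude,longitude
Point A,-26.2041,28.0473
Point B,-33.9249,18.4241
"""


def csv_to_geojson(text, lat_col="latitude", lon_col="longitude"):
    """Convert CSV rows with latitude/longitude columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        lat = float(row.pop(lat_col))
        lon = float(row.pop(lon_col))
        features.append({
            "type": "Feature",
            # GeoJSON stores coordinates as [longitude, latitude], in that order.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            # Every remaining column becomes a property of the point.
            "properties": dict(row),
        })
    return {"type": "FeatureCollection", "features": features}


collection = csv_to_geojson(SAMPLE_CSV)
print(json.dumps(collection, indent=2))
```

    The longitude-first ordering is a common stumbling block: points that land in the wrong hemisphere usually mean the two values were swapped.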

    6 – MapShaper.org
    http://mapshaper.org/
    Ease of use: Easy
    MapShaper is brilliantly simple. It does only a couple of things but it does them well, which makes it an essential part of my toolset. MapShaper opens most mapping formats such as shapefiles, geojson and topojson and renders those for you. This makes it really easy to quickly investigate the map data that you have. There is also an option to simplify the contour lines of your map which is essential in reducing your eventual file sizes. The inspector makes it easy to see the data that’s attached to all your map points and polygons. Maps can then be exported in a range of formats to use in other mapping tools.
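
    To show why simplifying lines shrinks a file, here is a small Python sketch of one well-known line simplification technique, the Douglas-Peucker algorithm. It is used purely as an illustration: nothing here claims to be the method MapShaper itself applies, and the sample line and tolerance are made up.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from a point to the straight line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def simplify(points, tolerance):
    """Drop points that sit within `tolerance` of the simplified line (Douglas-Peucker)."""
    if len(points) < 3:
        return list(points)
    # Find the point furthest from the line joining the two end points.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], points[0], points[-1])
        if dist > max_dist:
            max_dist, index = dist, i
    # If every point is close enough, the two end points are all we need.
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Otherwise keep the furthest point and simplify each half separately.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical line: a nearly flat stretch followed by a sharp peak.
line = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (4, 3), (5, 0)]
simplified = simplify(line, tolerance=0.1)
print(len(line), "points reduced to", len(simplified))
```

    The small wobbles along the flat stretch disappear while the peak survives, which is the trade-off to watch when simplifying: a higher tolerance means a smaller file but a rougher outline.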

    7 – Mapstarter
    http://mapstarter.com/
    Ease of use: Medium
    Mapstarter is exactly as it sounds. It’s a quick and easy way to turn a dataset into a visual map. Mapstarter opens shapefiles as well as Topojson and Geojson files and immediately renders them to the screen. You can then change styles such as colors and the interactive elements like mouseover info boxes. One of the useful features is the ability to edit the data in your file such as removing features. Maps can then be exported as SVG or image files. But where it really gets interesting is that maps can also be exported as basic D3.js maps. Anyone who has coded a D3.js map knows what a time saver this could be. I previously wrote up a slightly longer introduction to Mapstarter.

    A Leaflet.js-based map showing the marker cluster plugin.

    8 – Leaflet.js
    http://leafletjs.com/
    Ease of use: Medium/Hard
    Leaflet is a JavaScript library for mapping and it’s my go-to tool whenever I need something more customised than most other mapping tools offer. You do need to be able to programme to get the best out of Leaflet but once you do get the hang of it there’s no going back. Being JavaScript-based, Leaflet also works well with most other libraries, which makes the options for customising your maps endless. Combine those with this handy basemap chooser and your maps will be unique.

    9 – Mapbox
    http://mapbox.com
    Ease of use: Medium/Hard
    Mapbox is for when you’re looking for that completely unique map, something that no-one else has. And this is where you enter mapping geekdom. With Mapbox you can design your own maps from the ground up. Things like customising road colours, boundary lines or place names are done in Mapbox Studio. You can then save your own unique map styles and use those in other applications such as Leaflet or CartoDB. Mapbox also has its own JavaScript-based mapping library which you could use to build your own custom maps.

    10 – A few other tools
    There are tons of other mapping or map-related tools available. Some of the ones I do use include Color Brewer, which is great for finding a color scheme for your maps; QGIS, which is great for more high-end map manipulations though quite daunting at first; Tableau Public, which is great for general visualisation including maps; and Google’s Tour Builder, which is an interesting way to tell a story visually using a combination of maps and multimedia.

    This list is far from a comprehensive list of mapping tools but it is hopefully a useful starting place and overview for anyone getting into mapping.

    If you think there is a mapping tool I ought to be trying out send me an email (alastair@mediahack.co.za) or find me on Twitter (@alastairotter).

    If you found this useful, consider signing up for my weekly media and journalism newsletter for more tips.

    Read More

    Data data journalism visualisation

    Visualising the price of petrol

    Posted By Alastair Otter

    For some time now I’ve been working on some ideas around presenting the monthly fluctuation in the petrol price in an easy-to-understand and interactive form. For many people the sudden and seemingly volatile changes in the petrol price inspire thoughts of conspiracy (clearly government is fleecing us) and yet the actual changes in the petrol price are mostly rational and based on well-established principles and numbers. Obviously we, as citizenry, can complain that the various taxes applied to the price of petrol are excessive, or indeed insufficient, but the factors that affect the price of petrol month-to-month are largely out of our hands and even out of the direct hands of government (though their impact on the exchange rate is often fairly clear).

    Yesterday (May 3, 2017) the petrol price again went up, this time by 49c for 95 octane petrol in inland provinces (the usual benchmark).

    With this in mind I decided to try and complete a portion of the ongoing petrol price project and release that on its own, which I did. This portion tries to explain the major inputs and costs that are used to calculate the price of petrol at the pump. The chart does some grouping of costs to simplify it and breaks it down into the two major components: the basic fuel price on the one hand, and the various taxes, tariffs and costs on the other.

    If the embedded version below doesn’t work correctly please take a look at the full version on the petrol page.

     

    For the technically inclined, the majority of the graphic is built using D3.js. There’s a little bit of jQuery in there which I could probably replace with D3, and may do in the next iteration.

    If you have any thoughts on this, or questions, you can always find me on Twitter.

    Read More

    Data data journalism visualisation

    Visual storytelling: The story of South Africa’s water

    Posted By Alastair Otter

    In the second half of 2016, and early 2017, South Africa endured one of its worst droughts in close on 30 years. Coming so close on the heels of the previous year’s severe drought it got me thinking. Most of us grow up knowing that South Africa is a water-scarce country (we’re taught that at school) but I’m not sure that we understand what that means, especially as a fairly robust water network has to date meant we almost always have water in our taps. But what happens if that stops working or, as is most likely the case, the system is unable to service a growing population in times of severe water shortages?

    As a first step I began to look at the water storage infrastructure in the country, primarily the dams dotted around the country. Which were the big ones? Which were the small ones? Where are they mostly located?

    That was the first part of the project. The second part was to look at how these dams were replenished. One of the features of the most recent drought was the apparent disconnect between rainfall and water availability. In parts of Gauteng province, for example, there were strict water restrictions in place while at the same time there were floods in those areas. We had floods that wreaked havoc with property and in a few cases claimed lives. And yet there was barely a drop in the taps. That led to some work on the various catchment areas and how they fed the all-important dams that supplied the major metropolitan areas.

    The result was Part 1 of a planned series on water. The project combines narrative with interactive visualisation to try and tell a simple but important story.

    To view the full interactive story visit our water page.

    If you have any thoughts, comments, questions or suggestions on this you can always find me on Twitter. I also publish a weekly media newsletter that you may want to subscribe to.

    Read More

    data journalism

    Essential tips and tools for beginning data journalism

    Posted By Alastair Otter

    As the world of journalism changes many journalists are looking to learn new skills; skills better suited to an industry that is increasingly digitised and visual.  For many that probably entails learning something about data journalism and visualisation. But, if you’re from a strictly printed words background, the change can be daunting.

    For a start there is an ever-growing list of data journalism tools that are available which can be daunting. The question becomes, where to start?

    There is no single right answer. What you need to do is to decide what it is you want to achieve, and your particular working circumstances. If you work in a newsroom and your primary output is in a newspaper then you probably don’t need to learn to make interactive graphics. But if you work online then you may want to learn some data visualisation tools.

    The important thing to understand here is that no matter what kind of journalism you do you can benefit by learning some basic data journalism techniques. And don’t be fooled by the all-too-often portrayal of data journalists as code hackers. There is a place for great programmers but you don’t have to be a programmer to be a data journalist.

    What follows is an opinionated list of tools worth taking the time to explore. Most of these are tools I have come to rely on for a range of different projects, such as data driven stories like this. This is not a comprehensive list of tools, just a shortlist that makes up a good toolbox.

    Part 1: The data journalism basics

    Spreadsheets

    Yes, you can’t escape it. Spreadsheets are the core tool for any data journalism project. Too often journalists fall back on the old pretense that they’re no good with maths. You don’t need a PhD in mathematics to use a spreadsheet but a basic understanding of averages, means, medians and the ability to work with a spreadsheet will boost your reporting skills. If you’re completely new to spreadsheets there are many tutorials online that will have you up and running in no time.
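
    As a quick illustration of why the difference between a mean and a median matters in reporting, here is a minimal Python sketch. The salary figures are hypothetical and chosen only to show how a single extreme value drags the mean away from what most people in the dataset earn.

```python
from statistics import mean, median

# Hypothetical monthly salaries: six ordinary earners and one very high earner.
salaries = [8000, 9000, 9500, 10000, 11000, 12000, 150000]

average_salary = mean(salaries)
middle_salary = median(salaries)

print("Mean:  ", round(average_salary))
print("Median:", middle_salary)
```

    The same two calculations are available in any spreadsheet (typically as AVERAGE and MEDIAN functions), so checking both before quoting "the average" costs almost nothing.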

    For most people the first thing they think of when they hear spreadsheets is Excel, which is a great option but by no means the only one. Google Sheets is preferred by many spreadsheet newcomers because its simplified set of options give them the bits they need without the huge array of functions in Excel. If you want something free but powerful, Libre Office spreadsheets is one of the best options.

    Document organisation and collaboration

    One of the challenges in doing data journalism is how to manage large numbers of documents without losing your way. Again, Google Drive is a good starting point. Drive stores all of your documents in the cloud and makes it easy to share these with other users. Drive also has built-in version tracking, although it’s not immediately obvious, which means you can go back to previous versions of a document if you end up in a data dead end or if you make a mistake.

    While Drive has a ton of uses, sometimes you need something a little more focused on the task at hand. Which is where Document Cloud comes in. Document Cloud is also an online document storage service but it has a number of features that make it a great tool for data journalism. One of the most useful of these is the ability to upload PDFs to Document Cloud and have it convert these to text for you. Not only that but Document Cloud also indexes documents and over time it becomes possible to search across all your stored documents for particular words or names. Document Cloud includes annotations, it can build timelines from documents and makes it easy to embed portions of documents into your online stories. Also, multiple users can collaborate on the documents. Your newsroom will need to apply for an account but the service itself is free for news organisations.

    If you’re looking for something a little different to Document Cloud or Google Drive then it’s worth taking a look at Git and Github. Git has largely been the domain of programmers but increasingly journalists and other writers are turning to Git/hub for a range of reasons. Git is a version control system. You can create files, edit those while being able to revert to previous versions at any point. You can also “branch” files which means creating a second or third version of your files which you can experiment with. If these experiments work out you can then “merge” the changes back into your main files. If not you can dump the experiment and switch back to your original files. If you’re keen to try out Git and Github then do yourself a favour and watch Daniel Shiffman’s entertaining Git and Github for Poets YouTube series.

     

     

    Collecting and cleaning data

    The other reality about data journalism is that it is a rare occasion when you get to deal with clean data. Either you’ll be dealing with dozens of PDF files that need to be converted into something useful and verified, or you’ll have a dump of messy CSV or Excel files.

    If you’re looking to convert PDFs into text/numbers there are dozens of good tools that do good to excellent conversions. The problem is that PDFs are tricky things and your success converting them is largely based on how they are created. PDFs that were created directly from spreadsheets are typically easier to convert than PDFs that are actually made by scanning in a document and then saving to PDF. More often than not you’ll deal with this latter type, especially if you’re getting leaked data.

    If you’ve got a Document Cloud account this should be your first stop because it has PDF conversion built in. If you’re looking to convert just a portion of a PDF, or multiple similar portions of a document then try Tabula. With a little bit of practice Tabula can be made to do pretty reliable PDF conversions, even if your data is spread throughout multiple documents.

    There are also a number of online PDF conversion tools that work with varying degrees of success. One of the more popular is CometDocs which does conversion to multiple file formats. Zamzar offers a similar service. If you’re looking for something a little more robust then Nitro is worth testing. Nitro offers a free online PDF conversion service but it is also available as a paid-for desktop application. It’s not cheap but it’s very capable if you’re dealing with multiple documents on a regular basis.

    Once you’ve got your data you’ll probably need to clean it. If the data is not too messy or detailed then a spreadsheet is a good starting place. But, if you’ve got a file with hundreds or thousands of rows and multiple problems then Open Refine is the tool of choice. Open Refine used to be called Google Refine and it makes it relatively easy to clean up dirty datasets. One of its strengths is its ability to work with just portions of your dataset at a time. For my money, if you’re going to commit to learning anything then Refine would be my choice. Once you’re over the initial learning curve and you discover the power of Refine you won’t look back, and there are some good introductory tutorials available for Open Refine.
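
    To give a feel for what cleaning a dirty dataset involves, here is a minimal Python sketch that groups spelling variants of the same name. It is loosely modelled on the key-collision idea behind clustering in tools like Open Refine, but it is a simplified illustration, not Open Refine’s own code. The sample names and the `fingerprint` and `cluster` helpers are assumptions made for the example.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Reduce a name to a normalised key so that near-identical spellings collide."""
    # Strip accents, surrounding whitespace, case and punctuation.
    text = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    text = text.strip().lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Sort the unique words so word order no longer matters.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values sharing a fingerprint; return only groups with differing spellings."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Hypothetical messy column of names.
names = [
    "City of Cape Town",
    "city of cape town ",
    "Cape Town, City of",
    "Johannesburg",
    "Johannesburg",
]

clusters = cluster(names)
for group in clusters:
    print(group)
```

    Each group is only a suggestion that the entries might be the same thing. A person still has to decide which spelling is correct before merging them.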

    A tool similar to Open Refine is Data Wrangler which aims to make it as easy as possible to clean up and manipulate large data sets. I’m not overly familiar with Data Wrangler so my preference is for Open Refine but I mention it because it looks to be a promising tool.

     

    Part 2: Analysing and visualising data

    Once you’ve got your data cleaned and sorted you’ll want to see what the data is telling you. If you’ve read anything about data journalism you’ve probably heard someone say that you need to interview your data like you would interview a source. Just because you’ve got a set of data doesn’t mean you have a story. What you need to do is look at the data in multiple different ways to see what stands out. Also, when you do this you might well spot anomalies in the data, a sudden spike or dip in values. Sometimes these are the stories but often these are the result of a problem in your data.

    One of the easiest tools for doing a quick visualisation or two is Google Sheets. Excel or Libre Office could also be used but Google Sheets is perhaps the easiest of the tools when you’re looking for a quick chart. It’s worth looking at your data in multiple different views to see what the patterns look like.

    An initial view of Vaal Dam levels for every day of the past year. Visualising it this way makes it easy to spot anomalies or missing data points. Those sudden spikes are very likely errors in the data rather than actual spikes.
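
    The same check can be sketched in a few lines of Python for anyone who prefers code to charts. This is a rough illustration under stated assumptions: the dam level readings are hypothetical, the `check_series` helper is invented for the example, and the 10-point threshold is an arbitrary choice that would need tuning for real data.

```python
def check_series(values, max_jump=10.0):
    """Return (spikes, missing): indexes of isolated jumps and of missing readings."""
    missing = [i for i, value in enumerate(values) if value is None]
    spikes = []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if None in (prev, cur, nxt):
            continue
        # A spike or dip stands apart from BOTH neighbours in the same direction.
        up = cur - prev > max_jump and cur - nxt > max_jump
        down = prev - cur > max_jump and nxt - cur > max_jump
        if up or down:
            spikes.append(i)
    return spikes, missing


# Hypothetical daily dam levels (percent full), with two typos and one gap.
levels = [71.2, 71.0, 70.9, 7.09, 70.7, 70.6, None, 70.3, 107.1, 70.1]

spikes, missing = check_series(levels)
print("Suspicious readings at positions:", spikes)
print("Missing readings at positions:", missing)
```

    Flagged values are prompts to go back to the source, not proof of an error: a real overnight flood would trip the same check.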

     

    Another way to do initial visualisation is with one of a number of online tools. One of the easiest to use is Datawrapper which outputs your charts in multiple different ways. It’s a useful way to switch between different views quickly to get a sense of what works well. There are a few other services online, such as RAW or Quartz’s Atlas charts which produce good results.

    Once you’ve got an idea of what you want to do then it’s time to start creating. Most of the programs mentioned above will produce embeddable versions of the charts you’ve made but they may be limited in adding other elements like images, text areas or extra labels. For that you’ll need to look at some other tools.

    Piktochart and Infogram are among the best and easiest at doing this. Both make it easy to combine charts with other visual elements, and if you start with one of the pre-built templates you’ll have something decent looking in next to no time.

    If you’re looking for something more detailed with more than just a few default chart types then you should probably try out Tableau Public which is free and extremely powerful. It can build everything from the simplest charts to complete interlinked dashboards. But be warned, the initial learning curve can be a little daunting for first-timers. If you’re serious about data visualisation then take the time to learn more about Tableau Public. But if you just want the occasional chart to dress up a story then stick with one of the other options.

    Part 3: Maps and mapping

    If you do any kind of data journalism you’re bound to come across geographic data. Which brings up the issue of mapping tools, some of which are simple point and click affairs while others border on the arcane. So you need to think carefully about what you’re trying to achieve with geographic data.

    Too often the first instinct is to plot the points on a map. Which is worth doing in the initial exploratory stages in almost all cases, but often a map is not the best way to illustrate the point of a story. For example, having a map with 200 points all clustered around a small area is often not the most informative way to display data. While shaded contiguous areas to indicate some sort of distribution can be far more effective.
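
    Turning a pile of points into shaded areas mostly comes down to counting points per area and sorting the counts into a few buckets. The Python sketch below shows that step only. The suburb names, incident list, bucket breaks and the `shade_for` helper are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical incidents, each already tagged with the area it falls in.
incidents = [
    {"suburb": "Suburb A"}, {"suburb": "Suburb A"}, {"suburb": "Suburb A"},
    {"suburb": "Suburb A"}, {"suburb": "Suburb A"},
    {"suburb": "Suburb B"}, {"suburb": "Suburb B"},
    {"suburb": "Suburb C"},
]

SHADES = ["light", "medium", "dark"]


def shade_for(count, breaks=(2, 4)):
    """Pick a shade: below 2 is light, 2 to 3 is medium, 4 and above is dark."""
    return SHADES[bisect_right(breaks, count)]


counts = Counter(row["suburb"] for row in incidents)
shading = {suburb: shade_for(count) for suburb, count in counts.items()}

for suburb, shade in sorted(shading.items()):
    print(suburb, counts[suburb], shade)
```

    Where areas differ a lot in population, shading by a rate (incidents per 1,000 residents, say) rather than a raw count usually gives a fairer picture.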

    Having said that, a good map done right can add huge amounts to a data story, so what are the best tools?

    Once again Google is a good starting point. Google My Maps is one of the simplest tools to use. It’s pretty intuitive and makes it easy to look up geographic points, draw lines and shapes on maps and even add driving directions. If you just want to illustrate where or how something happened geographically then there is no better place to start.

    A step up from My Maps is Google Fusion Tables. This is also part of the Google Drive suite of tools. Fusion Tables in fact does a lot more than just make maps, though that is one of its strengths. Fusion Tables also makes it easy to filter data sets, do some cleaning up of data, merge multiple datasets into one and a fair amount more. It’s a little tricky at first but is a good choice when you’re dealing with larger data sets.

    If you’re really getting into this mapping thing and you want a bit more than the previous two options then CartoDB is your next step. Carto is all about maps and it has the potential to make excellent maps with multiple layers and different designs so long as you’re prepared to put in a little initial work. Personally I find Carto an excellent choice for mocking up a quick sample map or merging sets of data to include geographic points. It makes it pretty simple to visualise larger sets of data and make decisions about where you should go with your project. Carto also makes it easy to export the cleaned and fixed datasets into many formats which makes it easy to use in other applications.

    Undersea cables
    The world’s undersea cables as viewed in Mapshaper.org

    There are literally dozens of other applications for making maps, some of which are extremely powerful but often also very complex. ArcGIS is a popular tool, as is the open source QGIS application, but both are aimed at fairly experienced mappers so the learning curve can be steep. If you’re keen to try your hand at making your own map styles then Mapbox is great for that. Mapshaper.org is another of my most commonly used mapping tools because it makes it easy to get a quick visual representation of the data in your map files and it also makes it easy to simplify map shapes, something that can be extremely useful in keeping download times down.

    In conclusion

    Data journalism is a broad area of work with place for many different skills. Some might favour the visualisation side of data journalism while others may prefer the mapping side. No matter what you prefer doing or what the limitations of your newsroom are there is always something more to be learned about data journalism. The recommended route would be to start with the basics above and then gradually move into some of the more detailed areas.

    From experience the best way to become better at data journalism is to practice. Find a real world dataset and see what you can make out of it. It’s only when you’re working in a real world scenario that you’ll really learn the ins and outs of good data analysis.

    Comments, thoughts, feedback? You can find me on Twitter or leave a comment below.


    Read More

    media

    That awkward moment when your competitors use your column to promote their paywall

    Posted By Alastair Otter

    The past couple of years have seen a fair amount of mud-slinging between the Times Media Group (TMG) and Independent Media but this week it reached a new crescendo.

    It began with a piece by Financial Mail’s Ann Crotty in which she asked if Independent Media owner Iqbal Surve was stripping the company’s assets. There was a fair bit of speculation in the piece although also enough data to suggest that Crotty is not entirely off course on this one. Crotty is a former Independent Media journalist with a long history of tracking Independent’s owners. (more…)

    Read More

    data journalism visualisation

    Data Viz: What happens when countries rely on coal to grow their economies?

    Posted By Alastair Otter

    Laura Grant and I have just finished a brand new data and visualisation project on the effect coal has on carbon emissions in Bric countries. The data part was all the work of Laura while I spent a lot of time scratching my head over D3.js code (I’m still no expert but I learned a lot). The full visualisation can be found here and Laura has also published a video of the story here.

     

    Read More

    hacks visualisation

    DataViz: Matric 2015 pass rate comparison

    Posted By Alastair Otter

    As part of another, bigger, project I was playing around with some ideas in d3.js and created this simple comparison between the provincial pass rates for Matric 2015 and the national pass average. The blue circle is set to equal 100%, no matter how many students there were in each province. The red circle reflects the pass rate for each province as a percentage. (more…)

    Read More

    New Media

    How a free email newsletter turned a computer programmer into a Newsweek columnist

    Posted By Alastair Otter

    Rusty Foster’s Today in Tabs may be heavy on snark, but it also stands at the intersection of some important trends — the retro intimacy of email, the dance of new and old media, and the next wave of aggregation.

    Read More

    New Media

    What NPR learned about social media journalism in 2015

    Posted By Alastair Otter

    NPR’s social media desk looks back at 2015 and shares some of the many lessons it learned over the course of the year. Put some time aside for this because this is a pretty detailed list of lessons.

    Read More

    Data

    Six tricks to make your data visualisations look better

    Posted By Alastair Otter

    Whatever software you’re using, there’s simply no excuse for accepting the defaults and not trying to make your charts look more professional. But where do you start? It’s easy to look at great design and agree that it’s great, but when you’re looking at an Excel default, inspiration is much harder to come by. What do you change first?

    Read More