Ivory Tower Writing #7: A note on data and sources

This post covers some pointers on data and sources, which form the bulk of any paper, with a focus on printed and textual sources.

By the time you’re doing your literature review (or revising an existing one), you should be familiar with sources and data. This post serves as a guide to both. We’ll cover what “good” sources and data are and how not to get bamboozled into using less credible sources. As usual, the material here is aimed primarily at my undergraduate students. Let’s get to it.

Definitions come first

Let’s first define sources and data. Sources are, to put it simply, places where you can find juicy bits of information and/or data. These include books, journal articles, even right-wing conspiracy websites. Data (singular: datum) represents the smallest building block of knowledge. It can be either qualitative or quantitative. The following can be considered data:

  1. A country’s GDP in 2017.
  2. The poverty rate in a country in 2018.
  3. Total deaths by terrorist attacks from 2001 to 2008.
  4. The date when the Berlin Wall fell.
  5. The fact that Sukarno had a Japanese wife.

Data lives inside sources. You will find useful economic data on the World Bank’s website. You can find the number of tanks a country has in a given year in the annual Military Balance published by IISS. You can find the fact that Sukarno had a Japanese wife in his biography. You’ll know when Pearl Harbor happened from history books (or, if you lived in 1941, the newspaper).

Now that that’s clear, let’s move on to the good stuff.

Not all sources are created equal

We first need to face the fact that NOT ALL SOURCES ARE CREATED EQUAL. Remembering this will help you down the line, especially in an age where literally anybody with an internet connection can go online and claim their bullshit to be actual fact. While I do encourage my students to read a lot, they often read from less credible sources (often because these are what’s readily available, since good sources are locked behind paywalls). What’s worse, they then use these sources in their papers, which damages the papers’ credibility.

As an exercise, which of these sources would you rate as most trustworthy and want to cite in your paper?

  1. Your racist uncle who only comes over during Thanksgiving
  2. A boring and tedious 50-page journal article on the epistemology of intelligence
  3. A punchy article from a blogger proclaiming that the AI uprising is imminent
  4. A 1,000-page book on the history of weaving

Options (2) and (4) would be your best bets, unless you actually want to fail the class. Option (3) would be considered “safe”, but only if the blogger is Sarah Connor or someone with proven credentials.

Journals and books are always preferred

So, what constitutes a “good” source?

You should always give preference to academic journals or books. I can already hear the collective groan of the class.

“It’s too hard to read them!”

“They’re too long and dense!”

These sources have, at the very least, been peer-reviewed. For the uninitiated, peer review is a process where other researchers give you scathing comments on the paper/book, force you to revise minute details, and then ask you to revise again until the article/book finally gets published six years later. In other words, it’s like an editor giving their stamp of approval for an article. This process ensures the quality of the work to a certain degree, because it means the writer has at least used valid sources and data to support their argument.

Digressing a bit: the process is not always perfect. Some articles have passed peer review and been published despite serious flaws. One example is the notorious Andrew Wakefield paper, which claimed vaccines caused autism. The paper has since been retracted and has been called “bullshit” by the larger medical community. However, the damage has already been done. In other words, if you’re reading a journal article that is labelled “RETRACTED”, it’s probably bullshit.

Books, on the other hand, may or may not have been peer-reviewed. It’s quite easy to self-publish today. One rule of thumb is to check the publisher. Academic publishing is a small market; you’ll be seeing the same names all over the place. University presses, such as Oxford University Press and Stanford University Press, are a good indicator of a book’s quality. You’ll recognize academic publishers immediately from their boring, standardized covers (looking at you there, Routledge and Springer), indicating they understand their target market quite nicely.

As for “popular” presses (like the ones that publish novels), it’s possible the book isn’t peer-reviewed. If this is the case, you need to do some due diligence as well. You may want to check who the author is by reading the back cover. If they’re an academic, the book is probably a credible source. If they’re a journalist, exercise some caution, because journalistic writing is different from academic writing.

Here’s a neat tip: whenever the author feels the need to assert their credentials, that’s usually a yellow flag. In academic books, the author’s name is written as-is, without credentials. This is often not the case for “popular” books, where the author puts their professional credentials on the front cover.

Overall, these sources should be prioritized in academic writing. That said, there is an epidemic of predatory journals. Long story short, these are less-than-credible journals that publish anything submitted to them (often for a small “publishing” fee) with little, if any, peer review.

The internet

Now, while books and journal articles are indeed the best sources for papers, the world we live in is a fast-changing place. Printing presses simply cannot keep up. A book on the application of robotics in modern warfare written in 2009 will probably be obsolete by 2018. And so, we go to the internet for up-to-date data. This is where you, as the writer, should exercise extreme caution and be judicious in choosing your sources.

Most major news outlets have official websites. These are considered legitimate sources. The data you collect from these websites is credible to a certain extent. That’s because there’s an editorial team whose job is to review articles prior to publication to make sure they don’t report bullshit. Of course, like all mass media, these news items are prone to “framing”, so you may want to take them with a grain of salt.

The next best things that you can get online are reports and datasets. Reports usually include a wealth of qualitative and quantitative data that saves you a lot of work; datasets provide a wealth of statistics that you can manipulate to find spurious correlations. These usually come from websites like Pew Research Center or the numerous branches of the United Nations, such as the Human Rights Council. These are usually considered credible sources.

Distinguishing fact from opinion

I find it an important skill to be able to differentiate fact from opinion. A news item can be considered “data”. You can know who is being reported on, what the issue is, and maybe some finer details that you may think relevant. It is neutral, as it does not try to do anything beyond simply saying “Hey, the Berlin Wall fell.”

The opinion article is completely different. It tries to argue for or against a position (like the gun control debate), to persuade you into believing the author’s ideas. It’s like someone saying “Hey, the Berlin Wall fell and this is bad because it means no more Communism.” Notice the difference? Usually, newspapers run opinion pieces in a separate, labelled section. However, these labels are sometimes hard to spot, and opinions can make their way into the news section.

But here’s where it gets a bit tricky. There’s a bunch of opinions out there. How do you know which ones to trust?

Here’s where a bit of experience comes in. You need to know which sources are credible. Private research institutions, universities, think-tanks, and non-profits these days have a “Commentary” series of some sort. Commentaries are opinions written by (presumably) experts in their fields on recent issues. They usually write these because (1) it’s part of their job, (2) the issue itself is so new that it simply cannot be extended into an 8,000-word journal article, or (3) they want publicity. Some commentary sources include Project Syndicate and The Conversation. These sources have a basic editorial mechanism, meaning that articles need to go through editorial checks to make sure they are at least not bullshit. They may be biased, but at the very least, you can be sure that they are informed opinions worth your time and not the rants of a racist blogger.


I’ve covered some basics of how to distinguish good and bad sources for academic writing here. Basically, use books and academic journals; those are your best bet if you want to survive in the Ivory Tower. Use internet sources sparingly (unless the nature of your research requires you to engage heavily in internet research).

If you have additional questions, please leave a comment!
