Why not to use Wikipedia as a reference

August 31, 2007 by
Filed under: citation, reference, wikipedia 

I recently had an interesting discussion with my director about references. The goal of a reference is twofold. The writer can cite a text positively, for example when he uses an existing algorithm. A reference can also be used negatively, when the writer wants to highlight a gap in the literature and thus justify the originality of his work. As already mentioned in a former post, some books and articles cite Wikipedia. That’s where things go bad…

Wikipedia is a huge, open encyclopedia (i.e. anybody can contribute to it). Its main advantage is the tremendous number of articles it offers in almost any domain, which makes it a good place to get information quickly. But there are two main drawbacks. First, anybody can modify it. Some people may stop me here and argue that articles are reviewed by the community. Where references are concerned, though, the bigger problem is the second drawback: the content changes over time! And this is really bad…

When writing a data mining article, people usually cite journals or books. The implicit assumption is that the content is fixed and will not evolve over time. If a writer cites a description of a data mining algorithm on Wikipedia today, he has no guarantee that it will be the same in two, three or ten years. A reliable reference must i) remain unchanged over time and ii) be easily accessible. Following these ideas, writers should cite sources in this order of priority:

  • Journal articles or books
  • Theses (if in English)
  • Technical reports
  • Conference proceedings
  • Websites (but they should be avoided)

I think Wikipedia is definitely not a good source to cite. What is your opinion? Is Wikipedia a reliable source for references?

[Thanks to Prof. Ian Smith for fruitful discussions about this topic]



11 Comments on Why not to use Wikipedia as a reference

  1. Anonymous on Fri, 31st Aug 2007 7:02 pm
    You can use archive.org to give you a snapshot of a webpage at a point in time. For instance, here’s Wikipedia’s page on data mining as of June 29, 2007:

  2. David Gerard on Fri, 31st Aug 2007 9:33 pm
    … or, as we recommend, link to the particular version in the history! http://en.wikipedia.org/w/index.php?title=Data_mining&oldid=133902894 is the version as of June 29, which actually dates from June 27.

  3. Dean Abbott on Mon, 3rd Sep 2007 5:23 am
    I’ll use Wikipedia for ideas, but I also realize that it is not necessarily vetted. I still prefer trusted authors and publications for reliable information. That said, I must also say that most of the time, I find the content related to data mining on Wikipedia pretty good.

  4. Sandro Saitta on Wed, 5th Sep 2007 5:11 pm
    I think the same problem may happen with archive.org, since it may no longer exist in a few years.

    Regarding links to a particular version in the history, although this is a solution, web addresses may change.

    I agree that the above-mentioned drawbacks arise only in the worst cases. But the Web is constantly evolving. However, Wikipedia is definitely a good source of inspiration for data mining and related algorithms.

  5. Will Dwinnell on Sat, 8th Sep 2007 11:46 am
    I am reluctant to use anything on the Internet as a reference, unless it is something which has already been published elsewhere.

    The fundamental nature of the Internet is that it makes sharing of information extremely easy.

    This is beneficial to the extent that barriers, such as the economics of book and article publishing, are removed. Yay! Now that guy who’s an expert on dragonflies in some small town in Wyoming (and who isn’t associated with a university) has a mechanism for sharing his knowledge.

    This is detrimental to the extent that the Internet also removes the filtering effect provided by more traditional publishing processes. Ugh. Now every idiot who’s got some crackpot theory has a soap-box to stand upon.

    The quality of material on-line, including Wikipedia, is certainly mixed.

  6. Will Dwinnell on Mon, 17th Sep 2007 3:24 pm
    Here’s one perspective on this subject:

    The Faith-Based Encyclopedia

  7. Sandro Saitta on Fri, 22nd Feb 2008 5:06 pm
    Here are other examples from collegedegree.com

  8. Datashaping on Mon, 25th Feb 2008 1:22 am
    Wikipedia has a history of arbitrary censorship, even on subjects such as data mining or statistics. They censor on a very large scale: more than 50% of the best references are blacklisted on Wikipedia.

  9. Sandro Saitta on Tue, 4th Mar 2008 3:22 pm
    Thanks for the information, it is always good to know.

  10. Shilpa on Wed, 6th Aug 2008 1:50 pm
    Can anybody help me get the code related to “Semantic annotation applied to Frequent Pattern Mining”?

  11. morgan on Thu, 17th Dec 2009 5:29 pm
    I am a sophomore in high school and do research papers once a week. If I use information from Wikipedia, I get points deducted. My school principal has also made statements about Wikipedia, saying it is invalid and is not allowed.
