Stock Prediction using Decision Tree

This is the first post in a series on using Decision Tree for Stock Prediction. Here are the second, third, fourth and fifth posts.

I have been applying data mining to finance for a few months now, so I will give you some insight into my main project on stock market prediction. When I started at my company, I saw several projects (so-called “screeners”, i.e. stock-picking rules built on technical indicators, with no use of data mining). Most of them make two assumptions:

  • The rules based on technical indicators don’t evolve in time
  • Stocks are selected (and sometimes processed) differently according to the sector they belong to (e.g. health and care, industry, etc.)
Since I am not comfortable with these two assumptions, I have started a new project based on the following idea:
Each technical indicator may work for a particular stock and at a certain moment in time
This means that i) rules based on indicators should evolve in time and ii) each stock should be processed independently. Note that the second point doesn’t mean that there is no correlation between a particular stock and the sector it belongs to. It only means that stocks may behave differently and thus should be treated independently. Information from their sector could still be used in the forecasting process.

When seen as a black box, the system takes information about a specific stock (such as open, high, low, close, volume, etc.) as input and outputs a class value. The class is defined as follows:

1 if close[j+n] > (x% * close[j]) + close[j]
-1 otherwise

where n is the number of days between the current day and the predicted day, and x is a value chosen to take transaction fees into account (a fixed amount could also be used instead of a percentage). Class predictions are thus made for each stock independently. One year of daily data is used for training and the following month for testing. The window is then shifted forward so that the system adapts itself to the current market.
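To make the labeling rule and the shifting window concrete, here is a minimal Python sketch. It is only an illustration, not the code used in the project: the function names are hypothetical, x is passed as a fraction (0.01 for 1%), and 250 and 21 trading days are assumed approximations of one year and one month. Shifting the window by one test period at a time is also an assumption, since the step size is not specified above.

```python
def label_classes(close, n, x):
    """Return 1 when close[j+n] > close[j] + x * close[j], else -1.

    `x` is a fraction (0.01 means 1%). The last n days have no future
    close to compare against, so they are labeled None.
    """
    labels = []
    for j in range(len(close)):
        if j + n >= len(close):
            labels.append(None)  # future price not known yet
        elif close[j + n] > close[j] * (1 + x):
            labels.append(1)
        else:
            labels.append(-1)
    return labels


def sliding_windows(n_days, train_len=250, test_len=21):
    """Yield (train, test) index ranges over n_days of daily data.

    Assumed defaults: about 250 trading days for one year of training
    and about 21 for the following month of testing. The window moves
    forward by one test period after each step.
    """
    start = 0
    while start + train_len + test_len <= n_days:
        train = range(start, start + train_len)
        test = range(start + train_len, start + train_len + test_len)
        yield train, test
        start += test_len


# Made-up closing prices, used only to show the labeling rule.
closes = [100.0, 102.0, 101.0, 105.0, 104.0]
print(label_classes(closes, n=1, x=0.01))  # [1, -1, 1, -1, None]
```

With 300 days of data, sliding_windows(300) yields two train/test pairs: the first trains on days 0 to 249 and tests on days 250 to 270, the second trains on days 21 to 270 and tests on days 271 to 291.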

Here are the steps of the overall methodology, which uses decision trees for stock prediction:

1. Stock filtering
2. Data preprocessing
3. Classification tree
4. Risk management

In the following posts, I will explain each of these steps in detail.




10 Comments on Stock Prediction using Decision Tree

  1. Pedro on Thu, 25th Sep 2008 10:44 pm
     Hi!
     Do you want to share some knowledge with me?
     I’m thinking to focus my Master BI Degree in forecasting stocks… and I already have some ideias!
     Good Blog!!!!

  2. Sandro Saitta on Mon, 29th Sep 2008 12:53 pm
     Thanks for your comment Pedro. I can give you a list of books/articles about the subject if you’re interested. Regarding my personal experience, you will have an excerpt with the following posts on DMR.

  3. Shane Butler on Wed, 1st Oct 2008 1:03 am
     Hi Sandro,
     Do you assign classes of -1 and 1 only and or a scale between?
     Cheers, Shane

  4. Sandro Saitta on Wed, 1st Oct 2008 8:13 am
     Hi Shane,
     I’m using -1/1 for the classes (i.e. I have only two classes) but I use a more complex function for calculating the accuracy of my decision trees: I take into account the difference between close[j+n] and close[j].

  5. Themos Kalafatis on Tue, 7th Oct 2008 11:27 pm
     Hello,

     I was wondering as to whether you think that enhancing your models with Financial facts, your models could achieve higher accuracy? WHat is your opinion on this?

     Many Thanks!

  6. Sandro Saitta on Thu, 9th Oct 2008 3:16 pm
     Hi Themos,

     Very interesting question. First, I would like to state that I don’t believe that technical indicators are better/worse than fundamental indicators for stock picking. In this project, I have however decided to work with technical indicators. Therefore, I make the assumption that all information about the market is contained in the price of stocks. This is related to the old traders quote: “Buy the rumor and sell the news” (i.e. it’s too late to look at the news because the market has already been altered by the news).

     This is why I don’t use financial facts or news. However, a lot of work has been done on text mining on financial news to predict stocks evolution and I won’t be surprised that it could work. This is just another way of thinking a system.

     I hope my answer is clear enough.

  7. Themos Kalafatis on Thu, 9th Oct 2008 10:19 pm
     Hi Again Sandro,

     Thanks for your reply, i am looking forward for your findings.

     Best Regards,

  8. amir on Fri, 31st Oct 2008 10:16 pm
     hi sandro,
     great post actually..
     i am doing my final project this semester, and the topic is ‘apply data mining in retail business’…
     regarding your post, you briefly describe how decision tree can assist in stock prediction… if you dont mind, can you explain to me in term of algorithm itself..
     you state that there have 4 steps.. can you explain me each of steps.. :p if you dont mind..
     i really need your help..
     n i appreciate most your kindness..

     best regard,

  9. Sandro Saitta on Tue, 4th Nov 2008 11:57 am
     Thanks for your comment Amir. In fact, the four steps correspond to my overall methodology. It is not specific to decision tree.

     Regarding decision tree itself, I would suggest the book by Tan et al., Introduction to Data Mining.

     Hope it helps.
