Finding Interests of Visitors through Data Mining

October 24, 2009 by
Filed under: Uncategorized 

In a previous post, I mentioned one of my current projects at FinScore. In this post, I will discuss another possibility for online targeting that opens up when customer data and web profiles are merged (for customers of a given company).

First, I briefly define an interest group (IG) as an ensemble of visitors with the same interests. Each URL of a given website can be mapped into an interest. Examples of interests are Auto, Lifestyle, Entertainment, Sport, etc. A visitor can belong to zero, one or several interest groups at the same time.
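To make the definition concrete, here is a minimal Python sketch of mapping URLs to interests and deriving each visitor's interest groups. The URL prefixes, the `min_views` threshold and the function names are assumptions made for illustration; they are not the rules used in the project.

```python
from collections import defaultdict

# Hypothetical URL-prefix -> interest mapping; paths and names are illustrative.
URL_TO_INTEREST = {
    "/cars/": "Auto",
    "/living/": "Lifestyle",
    "/movies/": "Entertainment",
    "/sport/": "Sport",
}

def interest_of(url):
    """Return the interest for a URL path, or None if no prefix matches."""
    for prefix, interest in URL_TO_INTEREST.items():
        if url.startswith(prefix):
            return interest
    return None

def interest_groups(page_views, min_views=2):
    """Map each visitor to the set of interests seen at least min_views times.

    page_views is an iterable of (visitor_id, url) pairs. The min_views
    threshold is an assumed rule, not one taken from the post.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for visitor, url in page_views:
        counts[visitor]  # make sure visitors with no mapped page still appear
        interest = interest_of(url)
        if interest is not None:
            counts[visitor][interest] += 1
    return {
        visitor: {i for i, n in per_interest.items() if n >= min_views}
        for visitor, per_interest in counts.items()
    }

views = [
    ("v1", "/sport/football"), ("v1", "/sport/tennis"),
    ("v1", "/cars/review"), ("v1", "/cars/test-drive"),
    ("v2", "/movies/new"), ("v3", "/about"),
]
groups = interest_groups(views)
```

In this toy data, visitor `v1` ends up in two interest groups while `v2` and `v3` end up in none, which matches the "zero, one or several" case described above.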

Here is how the process works. First, a set of identified visitors (1) is tracked on the website. Since the pages they visit are recorded, one can deduce their IGs. We build a model using the CRM data of these visitors as input and their IGs as output (one binary variable per IG). In our case, this model is trained using a decision tree algorithm. All clients of the company, even the ones that have never been on the website, are then scored using the obtained model. It is thus possible to infer their IGs before they visit the website.
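The train-then-score loop can be sketched in a few lines of Python. This is only a toy illustration: the post does not say which decision tree algorithm or library is used, so the code below is a tiny CART-style tree (Gini impurity, binary splits) written from scratch, and the CRM features (age, owns a car) and labels are made up. In practice one would rather rely on an existing decision tree implementation.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """Find the (feature, threshold) that lowers weighted Gini impurity the most."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            s = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if s < best_score:
                best, best_score = (f, t), s
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Return a nested-dict tree; each leaf holds the share of positive labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return {"score": sum(labels) / len(labels)}
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def score(tree, row):
    """Walk down the tree and return the leaf score for one client."""
    while "score" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["score"]

# Hypothetical CRM rows for identified visitors: (age, owns_car).
crm_rows = [(25, 0), (31, 1), (28, 0), (52, 1), (60, 0), (47, 1)]
# One binary target per IG, deduced from the pages those visitors viewed.
ig_labels = {"Sport": [1, 1, 1, 0, 0, 0], "Auto": [0, 1, 0, 1, 0, 1]}

models = {ig: build_tree(crm_rows, y) for ig, y in ig_labels.items()}

# A client who has never been on the website is scored from CRM data alone.
never_visited = (29, 0)
scores = {ig: score(model, never_visited) for ig, model in models.items()}
```

One tree per IG keeps the sketch close to the "binary variables for each IG" wording; a single multi-output model would be another reasonable design.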

One possible use of such a score is to decide which ad, content or product to show to a new visitor (who is already a client of the company). It is also possible to use this IG information with another channel such as mail, e-mail or phone in a 1-to-1 marketing context. For example, clients with a high score for the sport IG may be contacted about a sport product, etc.
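Both uses can be sketched as simple lookups over the per-client scores. The score values, the thresholds (0.5 and 0.7) and the function names below are assumptions chosen for illustration only.

```python
def best_interest(scores, min_score=0.5):
    """Return the IG with the highest score, or None if none reaches min_score."""
    if not scores:
        return None
    ig, value = max(scores.items(), key=lambda kv: kv[1])
    return ig if value >= min_score else None

def contact_list(all_scores, ig, min_score=0.7):
    """Return the sorted ids of clients whose score for the given IG is high enough."""
    return sorted(
        client
        for client, scores in all_scores.items()
        if scores.get(ig, 0.0) >= min_score
    )

# Hypothetical model scores per client and per IG.
all_scores = {
    "c1": {"Sport": 0.9, "Auto": 0.2},
    "c2": {"Sport": 0.4, "Auto": 0.3},
    "c3": {"Sport": 0.75, "Auto": 0.8},
}

# On-site use: which theme to show when client c3 arrives for the first time.
theme_for_c3 = best_interest(all_scores["c3"])

# Other channels: who to contact about a sport product.
sport_targets = contact_list(all_scores, "Sport")
```

Returning `None` when no score is high enough leaves room for a default ad or content for clients whose interests the model cannot infer.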

So, what do you think of this approach? Have you tried similar approaches or completely different ones? I’m looking forward to reading your comments.

(1) “Identified” means that these visitors are also clients of the company. We can thus merge CRM data with web behavior, for example in a “log in” area.

2 Comments on Finding Interests of Visitors through Data Mining

  1. Steffen on Wed, 28th Oct 2009 6:30 pm

    Hello Sandro,

    Interesting project. If I understand you correctly, you build a flat, static model based on features that are independent of the behavior on the website. The target variable is then deduced from the sites visited, so that you can predict the interests of other visitors as long as they a) are logged in and b) are already clients. As you said, they do not even have to visit the site to get scored.

    So far, this strategy sounds good to me 🙂

    I guess the hardest part is to calculate the label. Merging visited sites, time spent on page and maybe the multiple categories of a site into one crisp value is, well, hard.

    Even harder is to identify how interested a visitor who is not logged in is in certain categories. Now we are leaving the area of flat models, and it is getting interesting... Any plans in this direction?

    Another point is that the categories may have a hierarchy (do they?). I recently experienced that building models for hierarchical categories is really non-trivial. Any thoughts on this issue?

    kind regards,


    PS: More posts like that. I like them technical 😀

  2. Sandro Saitta on Fri, 30th Oct 2009 11:21 am

    @Steffen: Thanks for your comment. Your explanations really correspond to what we are doing (I’m happy that you got it).

    Just a few points. First, there is only one website where we gather the user behavior. And yes, we predict interests for people that are clients AND that we can identify (either through a login or by other means). This identification allows us to make the link between online data (web profile) and offline data (CRM).

    For us, the hardest part is to aggregate the raw data from the web logs (several Gb per day) into a web profile for each user.

    When confronted with unidentified visitors, we find their interest groups after they have visited 1 or 2 pages on the website, using human-defined rules. But of course, there may be other ways of doing this.

    We have the following levels: URL -> Categories -> Interest Groups. Even if categories may have hierarchies, we don’t explicitly use this concept in our solution for the moment.

    I’m planning to write more about these topics in the following weeks 🙂
