Problem Statement Posted

Setting: You have been hired by the marketing branch of M.K. Nurich, a large subscription company based in New York City that offers a variety of magazines and periodicals to its customers. M.K. Nurich has grown to the point where it has a very large customer base with diverse interests. Its old approach of blanket marketing to all customers is becoming too expensive as costs rise and response rates fall with the population’s increasing diversity. Hence, M.K. Nurich has decided to adopt a more targeted marketing scheme across its new campaigns and, at the same time, to expand its base by attracting more potential high-value customers.

Task 1

Background:
To start, M.K. Nurich would like you to be able to effectively target new prospects by predicting a customer’s lifetime revenue (defined over a period of five years). M.K. Nurich keeps a database of all their customers’ attributes, as well as their subscriptions and the amount of revenue they generate for the company.

M.K. Nurich wants to run a customer acquisition campaign that will focus on bringing in high-revenue customers. It wants you to build a model that rank-orders customers by expected revenue, with an emphasis on placing as many high-revenue customers as possible at the top of the list.

Technical Details:
You will receive both a modeling dataset and a holdout dataset (the “scoring dataset”), each randomly drawn from the set of solicited customers in M.K. Nurich’s database. The modeling dataset is at the customer level and contains a variety of independent variables along with the dependent variable, five-year lifetime revenue. The independent variables are a customer snapshot taken at the beginning of the five-year period, when the customers were still only prospects. The holdout data has the same layout, except that the response variable is missing.

Subtask 1a) Total revenue captured in top 40% of population
Your task will be to return a list of these customers, rank-ordered, based on expected five-year revenue. Performance will be measured by total five-year revenue captured at the 40% population cut-off in the list.

As the Task 1 scoring file contains 7,054 observations, we will assess the total actual revenue captured in each team’s top 2,820 observations (~40% of population), which is simply the sum of the actual revenues from the rev_all variable over these observations.
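The measure described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the judges’ scoring code: the function name, the dictionary layout, and the toy revenue figures are all assumptions made for the example.

```python
def revenue_captured(ranked_obs_ids, actual_revenue, cutoff):
    """Sum actual revenue over the first `cutoff` IDs of a ranked submission."""
    top = ranked_obs_ids[:cutoff]
    return sum(actual_revenue[obs_id] for obs_id in top)

# Toy data, invented for illustration: OBS_ID -> actual rev_all.
actual = {101: 50.0, 102: 0.0, 103: 120.0, 104: 10.0, 105: 80.0}

# A hypothetical team's rank-ordered list, best prospect first.
submission = [103, 101, 105, 104, 102]

# With 5 observations, a 40% cut-off keeps the top 2.
cutoff = 2
captured = revenue_captured(submission, actual, cutoff)
```

In the real task the cut-off is 2,820 of 7,054 observations, so only the ordering within the top of the list matters for this subtask.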

Submission Requirements:
Rank-ordered list of OBS_IDs, from the scoring file, of top 2,820 predicted revenues.

Subtask 1b) Best model RMSE over entire Task 1 scoring population

Details:
This subtask assesses your overall model’s predictive accuracy.

Submission Requirements:
Entire list of OBS_IDs from the scoring file with your model’s predicted value of rev_all.
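For reference, a minimal sketch of RMSE, assuming the standard root-mean-squared-error definition (the statement does not spell out the formula). The function name and toy values are illustrative only.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error over every OBS_ID present in `actual`."""
    squared = sum((predicted[obs_id] - actual[obs_id]) ** 2 for obs_id in actual)
    return math.sqrt(squared / len(actual))

# Toy data, invented for illustration: OBS_ID -> rev_all.
actual = {1: 10.0, 2: 0.0, 3: 20.0}
predicted = {1: 13.0, 2: 4.0, 3: 20.0}

score = rmse(predicted, actual)  # errors are 3, 4 and 0
```

Because every observation contributes, a prediction must be supplied for each OBS_ID in the scoring file; a missing ID would raise a KeyError in this sketch.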

Task 1 Final Ranking: Teams will be ranked in subtasks 1a) and 1b), and the lowest sum of ranks determines final placement in Task 1.

  • Tie-break: RMSE of each team’s subtask 1b) predictions, computed over the 2,820 observations with the highest actual rev_all values
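A rough sketch of how the rank sum and tie-break could be combined. It rests on two assumptions that the statement does not confirm: that the tie-break RMSE is taken over the observations with the highest actual rev_all, and that a lower tie-break RMSE wins. All names and numbers are hypothetical.

```python
import math

def top_k_rmse(predicted, actual, k):
    """RMSE over the k observations with the highest actual revenue."""
    top_ids = sorted(actual, key=actual.get, reverse=True)[:k]
    squared = sum((predicted[i] - actual[i]) ** 2 for i in top_ids)
    return math.sqrt(squared / len(top_ids))

def task_order(subtask_ranks, tiebreak_rmse):
    """Order teams by sum of subtask ranks, then by tie-break RMSE (assumed lower is better)."""
    return sorted(
        subtask_ranks,
        key=lambda team: (sum(subtask_ranks[team]), tiebreak_rmse[team]),
    )

# Toy data, invented for illustration.
actual = {1: 100.0, 2: 0.0, 3: 60.0, 4: 5.0}
predictions = {
    "A": {1: 90.0, 2: 10.0, 3: 60.0, 4: 5.0},
    "B": {1: 100.0, 2: 0.0, 3: 40.0, 4: 0.0},
    "C": {1: 50.0, 2: 50.0, 3: 50.0, 4: 50.0},
}
subtask_ranks = {"A": (1, 2), "B": (2, 1), "C": (3, 3)}  # (rank in 1a, rank in 1b)

tiebreak = {team: top_k_rmse(p, actual, k=2) for team, p in predictions.items()}
order = task_order(subtask_ranks, tiebreak)
```

Teams A and B both have a rank sum of 3 here, so the tie-break value decides between them.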

Task 2

Background:
The customer population you have been analyzing is a subset of M.K. Nurich’s entire database of possible customers. M.K. Nurich has data on prospects that it never solicited because of an earlier attempt to model customer revenue. It used a fairly simple process to decide who would be profitable and who would not; that is, the solicited group was not a random sample from the entire database, and selection was related to expected profit. We no longer have access to this model, but we do know who the current customers are and who was not solicited based on this model, and we have the data available to M.K. Nurich at the time it made the original decision. M.K. Nurich now wants to know what the predicted revenue of this unsolicited population would be, based on its knowledge of its current customers and the more advanced models it feels you are capable of putting together. Ultimately, it would look to acquire new customers by expanding its base into this as yet untapped market.

Technical Details:
You will receive another dataset representing prospects who were screened out of earlier solicitations but became active customers of M.K. Nurich without being solicited. It contains all the independent variables available to M.K. Nurich at the time it originally chose not to solicit these people. As before, the dependent lifetime revenue variable will be missing. It is up to you to predict the five-year revenue these prospects would generate as customers. You must submit a prediction for each customer in the set.

Please note that you may keep your model from Task 1 or employ a new approach; the decision is up to you. The variables available will be the same as in Task 1, as each record is a prospect snapshot taken before the prospect became a customer. For this task, in addition to the model and code documentation required for all problems, you are required to submit a short document, one page or less, detailing why you chose your approach. In this task, innovativeness and solid reasoning in your approach will be assessed by the judges and used to break ties in the final task ranking. Please also be sure to clearly document any new models in your code.

Subtask 2a) Total revenue captured in top 40% of population
Your task will be to return a list of these customers, rank-ordered, based on expected five-year revenue. Performance will be measured by total five-year revenue captured at the 40% population cut-off in the list.

As the Task 2 scoring file contains 7,596 observations, we will assess the total actual revenue captured in each team’s top 3,040 observations (~40% of population), which is simply the sum of the actual revenues from the rev_all variable over these observations.

Submission Requirements:
Rank-ordered list of OBS_IDs, from the scoring file, of top 3,040 predicted revenues.
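One way to assemble such a list from model output, shown as a hedged sketch. The function name is hypothetical, and breaking ties in predicted revenue by smaller OBS_ID is an assumption made only to keep the output deterministic; the statement does not specify a rule.

```python
def ranked_submission(predictions, n_top):
    """Return the n_top OBS_IDs with the highest predicted revenue, best first."""
    ordered = sorted(
        predictions,
        key=lambda obs_id: (-predictions[obs_id], obs_id),  # ties: smaller ID first (assumed)
    )
    return ordered[:n_top]

# Toy data, invented for illustration: OBS_ID -> predicted rev_all.
predictions = {11: 30.0, 12: 75.5, 13: 30.0, 14: 5.0, 15: 42.0}

top_two = ranked_submission(predictions, n_top=2)
top_four = ranked_submission(predictions, n_top=4)
```

For the Task 2 scoring file, `n_top` would be 3,040.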

Subtask 2b) Best model RMSE over Task 2 population

Details:
This subtask assesses your overall model’s predictive accuracy.

Submission Requirements:
Entire list of OBS_IDs from the scoring file with your model’s predicted value of rev_all.

Task 2 Final Ranking: Teams will be ranked in subtasks 2a) and 2b) and the lowest sum of ranks determines final placement in Task 2.

  • Tie-break: Ranking in technical quality of approach as determined from the reading of the solution description and accompanying documentation by the judges. Judges’ decisions are final.

Overall Judging
First and foremost, please follow all judging criteria set forth in “The Data Mining Shootout Rules” document you received or will receive upon registration, noting that all solutions require an accompanying solid explanation.

In order to determine overall winners, teams will be given a rank of 1 through n in each task, where n is the number of valid entries for that task. If two or more teams are tied within a task, then all teams with the same score will be slotted in the lower rank (e.g. if within the top three teams there is a clear winner and the next two teams are tied, then the two tied teams will each be awarded third place, not second).
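The tie rule amounts to giving each team a rank equal to the number of teams scoring at least as well as it does. A minimal sketch, with a hypothetical function name and invented scores:

```python
def ranks_with_ties(scores, higher_is_better=False):
    """Rank teams so that tied teams all receive the lower (numerically larger) rank."""
    ranks = {}
    for team, score in scores.items():
        if higher_is_better:
            # Count teams scoring at least as high, the team itself included.
            ranks[team] = sum(1 for s in scores.values() if s >= score)
        else:
            # Count teams scoring at least as low, the team itself included.
            ranks[team] = sum(1 for s in scores.values() if s <= score)
    return ranks

# Toy RMSE scores (lower is better), invented for illustration.
rmse_scores = {"A": 10.0, "B": 12.5, "C": 12.5, "D": 20.0}
task_ranks = ranks_with_ties(rmse_scores)
```

Here A is the clear winner, while B and C are tied and both receive third place, matching the example in the rule.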

Final positions will be determined by the lowest total sum of ranks across all tasks. In case of ties, the following tiebreakers will be applied:

  1. Team with the highest rank in any task
  2. If the above fails to break the tie then the judges will split the tie based on the innovativeness of the solutions presented. All judges’ decisions are final.
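The first tiebreaker can be sketched as a two-part sort key. This reads “highest rank in any task” as the best single-task placement (the smallest rank number), which is an interpretation rather than something the rules state outright; the second tiebreaker is a judges’ decision and is not modeled. Team names and ranks are invented.

```python
def overall_order(task_ranks):
    """Order teams by total rank sum, then by best single-task rank (assumed reading)."""
    return sorted(
        task_ranks,
        key=lambda team: (sum(task_ranks[team]), min(task_ranks[team])),
    )

# Toy data, invented for illustration: team -> (Task 1 rank, Task 2 rank).
task_ranks = {
    "X": (1, 4),
    "Y": (2, 3),
    "Z": (4, 2),
    "W": (3, 1),
}

final = overall_order(task_ranks)
```

X and Y both total 5, and X places ahead because it holds a first place in one task. Teams still tied after this step would go to the judges.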

Data Release
Please note that data was released on February 28th, 2007. As detailed in the welcome e-mail, each team lead will be e-mailed on or before this date with directions on how to obtain the data. Any clarifications or updates to the problem task will be e-mailed to the team leads at the time updates are made.

Copyright © 2002 - 2008 Inductis Inc.