Cetli ("Shopping List") Competition

Participation
You must log in to participate.

Task 1: Guess the Product!

Description

We provide the train.csv file, which contains shopping list items from January to August 2016.

We take the shopping lists (“Cetlis”) from July and August and randomly remove one item from each Cetli. The task is to predict the missing item (the target variable is the ‘Product’ column) for each individual shopping list.
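
The same removal procedure can be reproduced locally to build a validation set from the training data. The following is a minimal sketch, not the organizers' code: it assumes the training rows can be read as (list id, product) pairs, and the function name make_holdout and the choice to skip single-item lists are hypothetical.

```python
import random
from collections import defaultdict

def make_holdout(rows, seed=0):
    """Group (list_id, product) pairs by list and remove one random item per list.

    Returns (remaining, targets): remaining maps list_id -> kept products,
    targets maps list_id -> the removed product to be predicted.
    """
    rng = random.Random(seed)
    lists = defaultdict(list)
    for list_id, product in rows:
        lists[list_id].append(product)

    remaining, targets = {}, {}
    for list_id, products in lists.items():
        if len(products) < 2:
            # Assumption: a single-item list leaves nothing to predict from, so skip it.
            continue
        idx = rng.randrange(len(products))
        targets[list_id] = products[idx]
        remaining[list_id] = products[:idx] + products[idx + 1:]
    return remaining, targets

# Toy data; the list ids and products here are illustrative only.
rows = [("a1", "alma"), ("a1", "tej"), ("a1", "kenyér"),
        ("b2", "alma"), ("b2", "ablakmosó")]
remaining, targets = make_holdout(rows)
```

A model can then be scored on targets using only the items in remaining, which mirrors how the test set is described above.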

Files

train.csv - the training dataset

test.csv - shopping list items whose product name needs to be predicted

sample_submission.csv - a sample file in the required submission format

Evaluation

Submissions are evaluated using the multi-class logarithmic loss. For each shopping list, you must submit a set of predicted probabilities (one for every Product in the dataset). The formula is:

$$\text{logloss} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log(p_{ij})$$

where $N$ is the number of shopping lists in the test set, $M$ is the number of product names, $\log$ is the natural logarithm, $y_{ij}$ is 1 if observation $i$ is of class $j$ and 0 otherwise, and $p_{ij}$ is the predicted probability that observation $i$ belongs to class $j$.

The submitted probabilities for a given shopping list are not required to sum to one because they are rescaled prior to being scored (each row is divided by the row sum). In order to avoid the extremes of the log function, each predicted probability $p$ is replaced with $\max(\min(p, 1 - 10^{-15}), 10^{-15})$.
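
For local validation, the scoring rule described above (row rescaling, clipping, then the average negative log probability of the true class) can be sketched as follows. This is an illustrative re-implementation, not the organizers' scoring code; the function name multiclass_log_loss and the input layout (true class indices plus one row of scores per list) are assumptions, and every row is assumed to have a positive sum.

```python
import math

EPS = 1e-15  # clipping bound from the description above

def multiclass_log_loss(y_true, probs):
    """y_true: true class index per observation; probs: one row of scores per observation."""
    total = 0.0
    for true_idx, row in zip(y_true, probs):
        row_sum = sum(row)                 # assumed > 0
        p = row[true_idx] / row_sum        # rescale the row so it sums to one
        p = max(min(p, 1 - EPS), EPS)      # clip away from 0 and 1
        total += math.log(p)               # only the true class contributes (y_ij = 1)
    return -total / len(y_true)

# A uniform prediction over M classes scores log(M), whatever the true classes are.
uniform_score = multiclass_log_loss([0, 5], [[1.0] * 260, [2.0] * 260])
```

Because a probability of 0 for the true product is clipped to 10^-15, a single confident miss costs roughly 34.5 on that row, so spreading a small amount of probability over all products is usually safer than submitting hard zeros.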

https://www.kaggle.com/wiki/MultiClassLogLoss

UPDATE:

The public score is computed from only a fraction of the test data set (600 lines out of the 801). The private leaderboard, calculated on the whole test set, will be published after the competition closes. Final prizes will be awarded based on the private score.

Submission Format

You must submit a CSV file containing the objectId of each Cetli and a predicted probability for every candidate product (260 columns, one per class). The order of the rows matters: they are expected in alphabetical order. The file must have a header and should look like the following:

objectId,ablakmosó,alma,...

hjdsme6Rt2,0.1,0.8,...

PzN7frABpB,0,0.3,...
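
A file in this layout can be produced with the standard csv module. The sketch below is only an illustration under stated assumptions: the helper name write_submission is hypothetical, the three-product list stands in for the full 260 columns, and object_ids is assumed to be already in the required row order (for example, as read from sample_submission.csv).

```python
import csv
import io

def write_submission(out, object_ids, products, probs):
    """Write a header row, then one row of probabilities per objectId.

    probs maps objectId -> list of probabilities aligned with `products`.
    """
    writer = csv.writer(out)
    writer.writerow(["objectId"] + list(products))
    for object_id in object_ids:
        writer.writerow([object_id] + list(probs[object_id]))

# Toy example with a uniform baseline; real submissions need all 260 product columns.
products = ["ablakmosó", "alma", "tej"]
object_ids = ["hjdsme6Rt2", "PzN7frABpB"]
uniform = [1.0 / len(products)] * len(products)
probs = {object_id: uniform for object_id in object_ids}

buffer = io.StringIO()
write_submission(buffer, object_ids, products, probs)
submission_text = buffer.getvalue()
```

When writing to disk instead of a StringIO buffer, opening the file with newline="" and encoding="utf-8" keeps the accented product names and the line endings intact.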

Task 2: Freestyle Task

In this task, you can propose a freestyle research plan without any limitations. Define a goal or a question to answer, and submit a research plan describing how you would address it.

This can be a visualization, a route planner, or a shopping behaviour analysis, and you may use additional external data; the options are up to you. We suggest that you maintain regular contact with the organizers to share and discuss your findings.

If you wish to participate, please send us a short description explaining how you want to use the data and the insights you expect from your analysis.

Plans and solutions should be sent to hello@bevasarlocetli.hu

 

Timeline

Competition starts: 15th September 2016

Final submission deadline: 15th November 2016

Deadlines are at 11:59 PM GMT+1 on the corresponding day unless otherwise noted. The competition organizers reserve the right to update the contest timeline if they deem it necessary.

 

Prizes

Task 1:

1st place - A VIP Ticket for the Big Data Universe Conference 2017 (http://www.bdu.hu) + 50 USD Amazon Gift Card + Big Data T-Shirt

2nd place - 100 USD Amazon Gift Card + Big Data T-Shirt

3rd place - 50 USD Amazon Gift Card + Big Data Mug

 

Task 2:

1st place - A VIP Ticket for the Big Data Universe Conference 2017 (http://www.bdu.hu) + 50 USD Amazon Gift Card + Big Data T-Shirt

2nd place - 100 USD Amazon Gift Card + Big Data T-Shirt

3rd place - 50 USD Amazon Gift Card + Big Data Mug