About the challenge

A typical Data Science project involves multiple phases, including identifying business opportunities, formalizing the problem, gathering and processing data, developing and training the model, and integrating the solution into a production environment. For this assignment, we have defined the problem and prepared the data, and your task is to develop a model that will use the available data to solve the task and provide relevant predictions. 

In this task, we will focus on users who have previously played Top Eleven and decided to return by reinstalling the game. These users, who had played the game before but stopped at some point, are referred to as “re-registrations” when they return to the game. By analyzing their data on the day they re-register (registration_data_training.csv), along with data from their “previous lives” (previous_lives_training_data.csv), we can discover ways to enhance their gaming experience – note that some users may have had more than one previous life.
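Because a user may have several previous lives, the previous-lives table generally needs to be collapsed to one row per user before it can be joined to the registration data. The sketch below shows one possible way to do that with pandas. It assumes that both files share a user_id column (only the submission format confirms that name), and the feature name previous_lives_count and the tiny in-memory tables are made up for illustration.

```python
import pandas as pd


def count_previous_lives(previous_lives: pd.DataFrame) -> pd.DataFrame:
    """Collapse the previous-lives table to one row per user.

    Assumption: the table has a `user_id` column, one row per previous life.
    """
    return (
        previous_lives.groupby("user_id")
        .size()
        .rename("previous_lives_count")  # hypothetical feature name
        .reset_index()
    )


def add_previous_life_features(
    registrations: pd.DataFrame, previous_lives: pd.DataFrame
) -> pd.DataFrame:
    """Left-join the per-user aggregate onto the registration rows."""
    counts = count_previous_lives(previous_lives)
    merged = registrations.merge(
        counts, on="user_id", how="left", validate="many_to_one"
    )
    # Users with no matching previous-life rows get a count of 0.
    merged["previous_lives_count"] = (
        merged["previous_lives_count"].fillna(0).astype(int)
    )
    return merged


# Made-up stand-ins for registration_data_training.csv and
# previous_lives_training_data.csv; the real files have more columns.
registrations = pd.DataFrame({"user_id": [1, 2, 3]})
previous_lives = pd.DataFrame({"user_id": [1, 1, 2]})

features = add_previous_life_features(registrations, previous_lives)
print(features)
```

With the real data, the two frames would come from pd.read_csv on the files named above, and further aggregates (sums, means, most recent values) could be added to the same groupby once the actual column names are known.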

Information about their previous lives, along with their interactions on the day they re-registered, provides important insights into the behavior of these re-registered users. With this data, we can make more accurate predictions about their future behavior, allowing us to create personalized experiences and content tailored specifically to them. This type of analysis is important for identifying users who have shown they can be active, which helps us understand their needs and expectations better.

Submission format 📃

You should save your results in the “days_active_first_28_days_after_registration_predictions.csv” file.

The submission file should contain a row for each user in the test dataset and 2 columns: user_id and predicted_days_active_first_28_days_after_registration. Here is an example:
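One way to produce a file with this layout is sketched below using pandas. The file name and the two column names come from the description above. The user IDs and predicted values are made up, and clipping predictions to the 0 to 28 range is an assumption based on the target being a count of active days within a 28-day window, not a stated requirement.

```python
import numpy as np
import pandas as pd

SUBMISSION_FILE = "days_active_first_28_days_after_registration_predictions.csv"
PREDICTION_COLUMN = "predicted_days_active_first_28_days_after_registration"


def build_submission(user_ids, predictions) -> pd.DataFrame:
    """Return a two-column frame: user_id and the predicted active days."""
    user_ids = list(user_ids)
    preds = np.asarray(predictions, dtype=float)
    if len(user_ids) != len(preds):
        raise ValueError("user_ids and predictions must have the same length")
    # Assumption: a user cannot be active on fewer than 0 or more than 28 days.
    preds = np.clip(preds, 0, 28)
    return pd.DataFrame({"user_id": user_ids, PREDICTION_COLUMN: preds})


# Made-up IDs and raw model outputs, for illustration only.
demo = build_submission([101, 102, 103], [3.2, 31.0, -1.5])

# Render as CSV text; index=False keeps the file to exactly two columns.
csv_text = demo.to_csv(index=False)
print(csv_text)

# To write the actual file:
# demo.to_csv(SUBMISSION_FILE, index=False)
```

Passing index=False matters here: without it pandas writes the row index as an extra, unnamed first column.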

  • The submission should be sent via email to jobfair@nordeus.com with a link to your GitHub repository (email subject: Data Science challenge). Please add your full name to the email! 🙂
  • Besides the file with predictions, the repository should contain all scripts, notebooks, visualizations, and images, including the code that shows how the predictions and exploratory data analysis were produced.
  • For this challenge, you can use the language of your choice, preferably Python or R.

The challenge is open until November 17, 2024, end of the day. Good luck!
