DrivenData

Questions we get often from data scientists

What can I find here?

DrivenData hosts online data science competitions where the data and problem are posed by a socially-minded organization. You get to put your analytic skills to the test in order to tackle real-world problems with real-world impact.

Why compete?

DrivenData competitions are a great way to hone your skills, prove your abilities, build real-world experience, and use your knowledge to make a difference. Some competitions are just for bragging rights, while many others offer cash and other prizes.

How do I sign up?

You can go directly to the signup page and register a new account. From there, check out the open competitions and join whichever ones you find most compelling.

If you aren't ready to sign up for an account but want to be notified of news and updates, hop on our mailing list (no spam, we promise). You'll be notified when we post new competitions.

How does it work for competitors?

Once we release the data, any team can download it, build a model, and make a submission. We typically give competitors a training set that contains both the independent variables (the inputs) and the dependent variables (the values to predict). We also release a test set with just the independent variables, and we keep the corresponding dependent variables secret. You submit predictions for the test set and we compare them against the actual values. At the end of the competition timeframe, the best-performing team is declared the winner!
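As a rough illustration of that workflow, here is a minimal Python sketch. Everything in it is an assumption made for the example: the toy data, the column names (id, x, y, prediction), the helper names fit_line and make_submission, and the CSV layout. Each competition defines its own data and submission format, so treat this as a sketch, not as the required format.

```python
import csv
import io
from statistics import mean

# Hypothetical training set: independent variable "x" AND dependent variable "y".
train = [
    {"id": 1, "x": 1.0, "y": 2.0},
    {"id": 2, "x": 2.0, "y": 4.0},
    {"id": 3, "x": 3.0, "y": 6.0},
]

# Hypothetical test set: independent variable only; the true "y" stays hidden.
test = [
    {"id": 101, "x": 4.0},
    {"id": 102, "x": 5.0},
]


def fit_line(rows):
    """Fit y = a + b * x by ordinary least squares on the training rows."""
    xs = [r["x"] for r in rows]
    ys = [r["y"] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    a = y_bar - b * x_bar
    return a, b


def make_submission(test_rows, a, b):
    """Return CSV text with one prediction per test row (illustrative layout)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", "prediction"])
    for r in test_rows:
        writer.writerow([r["id"], a + b * r["x"]])
    return buf.getvalue()


a, b = fit_line(train)
submission_csv = make_submission(test, a, b)
print(submission_csv)
```

A real entry would swap the one-variable line fit for whatever model suits the problem; the shape of the loop (fit on labeled data, predict on unlabeled data, write a submission file) stays the same.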

What makes a winning solution?

We typically pick one statistical metric to evaluate all the predictions on a level playing field. In addition to performing well on this metric, competitors should also think about the reproducibility of their code and the clarity of their approach. The ultimate goal is to make a difference, so the competition partner should be able to learn from your model and then use it in their work.
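To make the idea of a single shared metric concrete, the sketch below scores two made-up submissions against the same hidden values and ranks them. Root mean squared error (RMSE) is used purely as an example; the metric, the team names, and the numbers are all assumptions, and each competition states its own metric.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences (lower is better)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical hidden values held by the competition host.
hidden_actual = [8.0, 10.0]

# Hypothetical submissions from two teams.
submissions = {
    "team_a": [8.0, 10.0],
    "team_b": [7.0, 11.0],
}

# Every submission is scored with the same metric against the same hidden values.
scores = {team: rmse(hidden_actual, preds) for team, preds in submissions.items()}

# Rank teams from best (lowest error) to worst.
leaderboard = sorted(scores, key=scores.get)
print(leaderboard, scores)
```

Because every team is measured the same way on the same hidden values, scores are directly comparable, which is what "a level playing field" means here.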

Can I use challenge datasets outside of the competition?

By default, datasets used in a challenge are only approved for the purpose and duration of the challenge. In many cases, the data provider may also approve use of the data beyond the challenge, for instance in research publications or for ongoing learning and development. If that's the case, additional guidance and links to the data will be provided on the competition website.

If you have a question about a specific competition you participated in, you can always ask on the DrivenData discussion forum. If you're looking for past competitions with openly available data, you can find the latest on the competition search page.