Hi All,
I’m planning to take on a few Kaggle NLP competitions soon. Is anyone interested in teaming up?
Prerequisites: 1. A decent grasp of raw PyTorch. 2. Familiarity with HF Transformers (e.g., the material covered in the official HF Course). 3. A minimum time commitment of 10 hours a week. If you can’t commit that much time, then this won’t work out.
If you meet the above prerequisites, then I propose a “cooperate to dominate” strategy for each new NLP competition:
- We’ll start off with a Zoom call to discuss the problem statement & do some basic EDA.
- Jointly discuss a watertight CV strategy, architecture options, training procedure ideas, etc.
- Divide the work of reviewing the top public Kaggle notebooks each week, extract the strongest ideas from each, and combine them into the best possible model.
- Divide up the experimentation - for example, each person trains one fold or one seed.
- Divide the responsibility of reading & summarising relevant research papers & notebooks from past competitions.
- Create a central knowledge repository for each live competition, recording what worked and what didn’t.
- Use an experiment tracking tool (e.g., Weights & Biases / Neptune.ai) to track all the experiments by all the team members.
- Ensemble the best models to get the best possible rank.
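
To make the "one fold per person" and ensembling steps concrete, here is a minimal sketch of how a shared fold split and a simple mean blend might look. Everything in it is an assumption for illustration: the helper names (`assign_folds`, `average_predictions`), the member names, the seed, and the prediction values are all made up, and a real competition would likely need a stratified or grouped split rather than a plain shuffle.

```python
import random
from statistics import mean


def assign_folds(n_samples, n_folds=5, seed=42):
    """Map each sample index to a fold id in [0, n_folds).

    A fixed seed means every teammate gets the identical split,
    so out-of-fold scores stay comparable across people.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [0] * n_samples
    for position, sample_idx in enumerate(indices):
        folds[sample_idx] = position % n_folds
    return folds


def average_predictions(per_model_preds):
    """Element-wise mean of predictions from several models."""
    return [mean(values) for values in zip(*per_model_preds)]


# Hypothetical five-person team: each member owns one validation fold.
members = ["member_a", "member_b", "member_c", "member_d", "member_e"]
folds = assign_folds(n_samples=20, n_folds=len(members))
fold_owner = {fold_id: name for fold_id, name in enumerate(members)}

# Made-up test-set probabilities from two models, blended by simple mean.
blended = average_predictions([[0.25, 0.75], [0.75, 0.25]])
```

Committing the fold assignment (or just the seed) to the shared repository keeps everyone on the same split; weighted blends or stacking are natural next steps once the plain mean is working.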
Looking forward to your responses!
P.S. The Kaggle team size limit is typically 5. If more than 5 people are interested, then we can create multiple teams.