Week 8
General Experience
This week, I primarily worked on improving the predictions by experimenting with several regularization options: L1 regularization, L2 regularization, an elastic net, adding an IRM penalty, and adjusting the penalty weight. Most of the work was empirical rather than theoretical, so whatever worked best was implemented in our model's training/prediction algorithm. I also refactored and improved the code to make it easier to reproduce, documented the steps, and explained different parts of the algorithm. Finally, I plotted our estimators while running the experiments to assess the validity and reliability of our model's predictions.
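Since most of this was empirical tuning, here is a minimal sketch of the kind of combined objective I was sweeping over. It is written in PyTorch purely for illustration; the function name, the default weights, and the use of a squared-error risk are assumptions, not our exact implementation.

```python
import torch
import torch.nn.functional as F

def combined_loss(model, envs, irm_weight=1.0, l1_weight=1e-3):
    """Sum of per-environment risks, plus an IRMv1-style penalty and an
    L1 penalty on the parameters. Weight values here are placeholders."""
    erm_loss = 0.0
    irm_penalty = 0.0
    for x, y in envs:  # each environment: a (features, targets) pair
        # Dummy scalar multiplier used only to compute the IRMv1 penalty.
        scale = torch.tensor(1.0, requires_grad=True)
        risk = F.mse_loss(model(x) * scale, y)
        erm_loss = erm_loss + risk
        # IRMv1 penalty: squared gradient of the risk w.r.t. the dummy scale.
        grad = torch.autograd.grad(risk, [scale], create_graph=True)[0]
        irm_penalty = irm_penalty + grad.pow(2)
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    return erm_loss + irm_weight * irm_penalty + l1_weight * l1_penalty
```

Adjusting the penalty weight in the experiments amounts to sweeping over values of `irm_weight` (and similarly for the regularization weights).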
Results and Experiments
The model with IRM penalty and L1 regularization: as we can see, the error looks roughly constant, with some variance. So, I plotted the estimators:
which shows that the estimators themselves are being treated as constants as well, which shouldn't be the case, since our train and test domains come from different data distributions. L1 regularization only:
The results look promising, so I plotted the estimators as well.
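For the estimator plots themselves, the check is simply to overlay the fitted coefficient vectors from each environment and see whether they actually differ. A rough matplotlib sketch (the data layout and names here are hypothetical, not taken from our code):

```python
import matplotlib.pyplot as plt

def plot_estimators(estimates_per_env):
    """Overlay estimated coefficient vectors, one curve per environment.
    Curves lying on top of each other suggest the estimators are being
    treated as constants across environments."""
    for env_name, coefs in estimates_per_env.items():
        plt.plot(coefs, marker="o", label=env_name)
    plt.xlabel("coefficient index")
    plt.ylabel("estimated value")
    plt.legend()
    plt.show()
```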
Applying L2 regularization or an elastic net (L1 + L2 regularization, each with its own weight) with different L1 and L2 weights did not yield better results; therefore, I stuck with L1 regularization to account for the errors.
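In sketch form, the elastic net variant just adds the two penalties with separate weights (again assuming PyTorch-style tensors; the names are placeholders):

```python
def elastic_net_penalty(params, l1_weight, l2_weight):
    """Elastic net penalty: an L1 term and an L2 term, each with its own weight."""
    l1 = sum(p.abs().sum() for p in params)
    l2 = sum(p.pow(2).sum() for p in params)
    return l1_weight * l1 + l2_weight * l2
```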
Frustrations
There was a bug in my algorithm for about four weeks that I hadn't discovered, because it didn't have a significant effect on the results, yet it kept them from being fully accurate: the loss was being computed incorrectly, shifting its value by some offset. However, I was able to catch the bug and resolve it, making the code more robust.
Plans for next week
Next week will be spent on experimentation and testing: trying different conditions for the train/test environments and analyzing the behavior of our model in different settings. This will hopefully allow us to decide when to use our model, and when not to, based on the data we have. I will also try testing our algorithm on a raw dataset to see how our model behaves compared to pooled data or i.i.d. (independent and identically distributed) environments.