Week 1

General Experience

I started research on Tuesday, June 2nd. My first week was exhausting and tough. I skimmed through two research papers: Invariant Risk Minimization and Invariant Models for Causal Transfer Learning. In addition, I read multiple blog posts about causal inference and causal models while brushing up on my multivariable calculus, linear algebra, and probability theory. I also looked through PyTorch tutorials to get ready to run empirical experiments on our findings. I think I learned a lot this week. Working online is somewhat inconvenient because I am not in close proximity to the research lab or my mentor. Although they are very responsive and approachable through online communication channels, I still think it would have been easier if we had been in an in-person setting. One of the coolest things I read this week was “If both Newton’s apple and the planets obey the same equations, chances are that gravitation is a thing” from the IRM research paper.

Findings

  • In most machine learning applications, data are often marred by selection biases, confounding factors, and other peculiarities. Therefore, to keep machine learning systems from relying excessively on these data biases, we try to leverage tools from causation to develop the mathematics of spurious and invariant correlations.
  • We are investigating a learning paradigm called Invariant Risk Minimization (IRM), which estimates nonlinear, invariant, causal predictors from multiple training environments to enable out-of-distribution (OOD) generalization. In other words, we collect data under multiple training environments, each following a different distribution, and we use these data sets to learn a predictor that performs well across a large set of unseen but related environments (a rough sketch of the IRMv1 penalty appears after this list).
  • The invariant causal prediction techniques search for a subset of variables that, when used to estimate a regression for each environment, produce regression residuals with equal distribution across all environments (a simplified sketch of this search also appears after the list).
  • Transfer learning attempts to address the scenario in which distributions may change between training and testing; therefore, we want to find a set of features or predictors that can work on unseen domains of data.
  • I found that the expected loss for a causal model using both X and Z can be less than the expected loss for a causal model using only X under certain conditions on the external errors and internal biases, where X and Z are random variables: X is a causal variable and Z is an anti-causal variable. Manipulating these factors can influence how well our causal model predicts on unseen test data (the toy simulation after this list illustrates this).
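
Since IRM is central to what we are researching, here is a minimal PyTorch sketch of the IRMv1 penalty described in the Invariant Risk Minimization paper. The function names, the binary-classification setup, and the way environments are passed in are my own placeholder choices rather than the authors' reference implementation.

```python
# Minimal sketch of the IRMv1 penalty (Arjovsky et al.), assuming a
# binary-classification model; names and structure are my own placeholders.
import torch
import torch.nn.functional as F

def irm_penalty(logits, y):
    # Gradient of the per-environment risk with respect to a fixed "dummy"
    # classifier w = 1.0; its squared norm measures how far w = 1.0 is from
    # being optimal for this environment.
    w = torch.tensor(1.0, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * w, y)
    grad = torch.autograd.grad(loss, [w], create_graph=True)[0]
    return grad.pow(2)

def irm_objective(model, envs, lam=1.0):
    # envs: list of (x, y) pairs, one pair per training environment.
    total_risk, total_penalty = 0.0, 0.0
    for x, y in envs:
        logits = model(x).squeeze(-1)
        total_risk = total_risk + F.binary_cross_entropy_with_logits(logits, y)
        total_penalty = total_penalty + irm_penalty(logits, y)
    return total_risk + lam * total_penalty
```

The hyperparameter lam trades off the usual empirical-risk term against the invariance penalty: with lam = 0 this reduces to pooled ERM, and larger values push the model toward predictors whose optimal classifier is the same in every environment.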
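For the invariant causal prediction bullet, the following is a heavily simplified sketch of the subset search, using a pooled least-squares fit and a pairwise Kolmogorov-Smirnov test as the invariance check. The real ICP procedure uses a more careful statistical test; everything below is my own illustrative construction.

```python
# Simplified sketch of invariant causal prediction: search over subsets of
# predictors and keep those whose regression residuals look identically
# distributed across environments. Not the ICP authors' implementation.
from itertools import combinations
import numpy as np
from scipy.stats import ks_2samp

def residuals_per_env(x_env, y_env, cols):
    # Pool all environments, fit OLS on the chosen columns, then return the
    # residuals split back out per environment.
    X = np.vstack([x[:, cols] for x in x_env])
    y = np.concatenate(y_env)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return [ye - xe[:, cols] @ beta for xe, ye in zip(x_env, y_env)]

def invariant_subsets(x_env, y_env, alpha=0.05):
    # x_env: list of (n_e, d) arrays; y_env: list of (n_e,) arrays.
    d = x_env[0].shape[1]
    accepted = []
    for k in range(1, d + 1):
        for cols in combinations(range(d), k):
            res = residuals_per_env(x_env, y_env, list(cols))
            # Accept the subset if no pair of environments rejects equality
            # of the residual distributions.
            pvals = [ks_2samp(res[i], res[j]).pvalue
                     for i in range(len(res)) for j in range(i + 1, len(res))]
            if min(pvals) > alpha:
                accepted.append(cols)
    return accepted
```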
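Finally, here is a toy simulation of the last finding above. The causal structure, noise levels, and variable names are my own assumptions, chosen only to show that a regression which also uses the anti-causal variable Z can achieve lower expected loss than one using only the causal variable X while the test environment matches training, and then lose that advantage once the anti-causal mechanism shifts.

```python
# Toy simulation (my own construction, not from the papers): X -> Y -> Z,
# so X is causal for Y and Z is anti-causal. Using Z helps in distribution
# but hurts when the Z-noise changes at test time.
import numpy as np

rng = np.random.default_rng(0)

def make_env(n, sigma_z):
    x = rng.normal(size=n)
    y = x + rng.normal(scale=1.0, size=n)        # Y caused by X
    z = y + rng.normal(scale=sigma_z, size=n)    # Z caused by Y (anti-causal)
    return x, y, z

def fit_ols(features, y):
    beta, *_ = np.linalg.lstsq(features, y, rcond=None)
    return beta

def mse(features, y, beta):
    return np.mean((y - features @ beta) ** 2)

# Train in an environment where Z is very informative about Y.
x, y, z = make_env(100_000, sigma_z=0.3)
beta_x  = fit_ols(np.column_stack([x]), y)
beta_xz = fit_ols(np.column_stack([x, z]), y)

# Test once in a matching environment, once after shifting the Z-noise.
for sigma_z_test in (0.3, 3.0):
    xt, yt, zt = make_env(100_000, sigma_z=sigma_z_test)
    print(f"test sigma_z={sigma_z_test}: "
          f"X-only MSE={mse(np.column_stack([xt]), yt, beta_x):.3f}, "
          f"X+Z MSE={mse(np.column_stack([xt, zt]), yt, beta_xz):.3f}")
```

In this setup the X-only regression keeps roughly the same error in both test environments, while the X+Z regression is much better when the noise matches training and much worse after the shift, which is the kind of trade-off the finding describes.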

Frustrations

The mathematics in the academic literature is complex and complicated. It took me a lot of time to follow and understand it: very tedious work of reduction and algebraic simplification, in addition to using many properties from probability theory and linear algebra. I think this image summarizes my mathematical struggle during the week: [Week 1 Frustrations image]. I guess I will need some time to get accustomed to the mathematical jargon and to using the different properties of probability theory and linear algebra.

Plans for next week

I am going to read the research papers in depth, explore the GitHub repos and the algorithms implemented in them, and try to run some tests in Python. I still haven’t figured out the answer to the first problem I was given, but I hope to find an answer by next week.

Other information worth sharing!

So far, there have not been any tests or algorithms to write, but I expect to start doing that next week. In addition to the research, as a side hobby, I am learning about data analysis methods that some companies use and trying to find connections between what we are researching and what data analysis is about. I really do believe that machine learning models are going to change the way we think about data and the world.

Written on June 6, 2021