Week 9 & 10

General Experience

Well, these are my last two weeks of this amazing research experience. During Week 9, I was refining the results, trying out different methods to improve our model, testing different layers with the set transformer, and implementing different experiments. While writing my technical report, I read additional papers discussing similar algorithms and approaches.


Week 8

General Experience

This week, I primarily worked on improving the predictions by experimenting with multiple measures: an L1 regularization term, an L2 term, an elastic net, and an IRM penalty with an adjustable penalty weight. Most of the work was empirical rather than theoretical, so whatever worked best was implemented in our model's training/prediction algorithm. I also refactored and improved the code to make it easier to reproduce, documenting steps and explaining different parts of the algorithm. Finally, I plotted our estimators while performing the experiments to assess the validity and reliability of our model's predictions.
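For the record, here is a minimal sketch (in PyTorch) of the kind of combined objective I experimented with: an ERM loss averaged over environments, plus an IRMv1-style penalty and an L1 regularization term. The function names, the binary-classification loss, and the weighting scheme are illustrative assumptions, not the exact code used in our model.

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, y):
    """IRMv1-style penalty: squared gradient norm of the risk w.r.t. a frozen dummy scale w = 1.0."""
    scale = torch.tensor(1.0, requires_grad=True, device=logits.device)
    loss = F.binary_cross_entropy_with_logits(logits * scale, y)
    grad = torch.autograd.grad(loss, [scale], create_graph=True)[0]
    return grad.pow(2).sum()

def total_loss(model, envs, l1_weight=1e-4, irm_weight=1.0):
    """ERM risk averaged over environments, plus the IRM penalty and an L1 term."""
    erm, penalty = 0.0, 0.0
    for x, y in envs:                          # envs: list of (features, float 0/1 labels) per domain
        logits = model(x).squeeze(-1)
        erm = erm + F.binary_cross_entropy_with_logits(logits, y)
        penalty = penalty + irm_penalty(logits, y)
    l1 = sum(p.abs().sum() for p in model.parameters())
    return erm / len(envs) + irm_weight * penalty / len(envs) + l1_weight * l1
```

Adjusting `irm_weight` and `l1_weight` here corresponds to the penalty-weight tuning mentioned above.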

Results and Experiments

The model with the IRM penalty and L1 regularization: [figure: error with IRM penalty and regularization term]. As we can see, the error looks roughly constant with a degree of variance. So, I plotted the estimators: [figures: IRM penalty and L1 regularization parameters].


Week 6 & 7

General Experience

I spent these two weeks conducting experiments, designing frameworks, and testing and iterating on my algorithms. I have been mostly interested in comparing different algorithms and techniques. In addition, I made sure that the experimental results followed the mathematical formulas and observations I obtained earlier. It has been joyful and frustrating, exciting and boring! The good thing is that there were no specific readings for these two weeks; I spent most of the time reading threads on Stack Overflow and consulting GitHub Discussions to resolve my code errors. My algorithm mainly performs domain generalization using causal and anticausal features through a set transformer, ending up with a model that performs better than ERM on a dataset pooled from multiple source and transfer domains. I ended up with a good start, but it still needs a lot of modification, refactoring, and editing.
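To make the comparison concrete, here is a rough sketch of the ERM baseline I compared against: all source domains are pooled into a single dataset, one model is fit on it, and the error is then measured per transfer domain. The architecture, loss, and helper names are hypothetical placeholders, not our actual implementation.

```python
import torch
import torch.nn as nn

def pooled_erm_baseline(source_domains, in_dim, epochs=100, lr=1e-3):
    """Hypothetical ERM baseline: pool all source domains and fit one model on the pooled data."""
    X = torch.cat([x for x, _ in source_domains])   # source_domains: list of (features, targets)
    y = torch.cat([y for _, y in source_domains])
    model = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X).squeeze(-1), y)
        loss.backward()
        opt.step()
    return model

def transfer_errors(model, transfer_domains):
    """Mean squared error of a trained model on each held-out transfer domain."""
    with torch.no_grad():
        return [nn.functional.mse_loss(model(x).squeeze(-1), y).item()
                for x, y in transfer_domains]
```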

Experiments and Results

My algorithm used a neural network with different layers; the preliminary results are shown below. [figure: preliminary results]


Week 5

General Experience

I guess easiness isn’t guaranteed when it comes to research! This has been a very challenging week. I started the week by learning about neural networks. I had encountered neural networks in my machine learning class at Whitman before, and I have actually trained some models, but I hadn’t gone as deep into understanding them as I did this week. I have also realized the ubiquity and utility of open source, since I am currently using/reusing multiple libraries, packages, and pieces of code from GitHub! I have gone through the Set_Transformer repository, and it has been very helpful. In fact, I have trained a set transformer to predict certain properties of datasets, with those datasets coming from different distributions. In other words, let’s dismantle the i.i.d. (independent and identically distributed) assumption about machine learning datasets!
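As an illustration of the idea (not the Set_Transformer repo’s actual modules), here is a tiny permutation-invariant encoder in PyTorch: self-attention mixes information across the elements of a set, and mean pooling makes the output independent of their order, so the model can map a whole dataset to a predicted property.

```python
import torch
import torch.nn as nn

class TinySetEncoder(nn.Module):
    """Illustrative permutation-invariant encoder: self-attention over set elements,
    then mean pooling, predicting a property of the whole dataset."""
    def __init__(self, in_dim, hidden=64, out_dim=1, heads=4):
        super().__init__()
        self.embed = nn.Linear(in_dim, hidden)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))

    def forward(self, x):                    # x: (batch, set_size, in_dim)
        h = self.embed(x)
        h, _ = self.attn(h, h, h)            # attention mixes information across set elements
        return self.head(h.mean(dim=1))      # mean pooling removes dependence on element order

# Each training example is an entire (small) dataset, possibly drawn from a different distribution.
sets = torch.randn(8, 100, 5)                # 8 datasets, 100 samples each, 5 features
print(TinySetEncoder(in_dim=5)(sets).shape)  # torch.Size([8, 1])
```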

Readings (Literature, online blog posts, and tutorials)

I have read Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks, and I have gone through the following tutorials for reviewing my neural networks knowledge:

  1. Deep Learning
  2. Transfer Learning
  3. Self Attention
  4. Neural Networks in Python and R

Finally, I went through the Set_Transformer repo in detail. It has been a lot of reading and learning this week. Very engaging, challenging, and profound. Out of personal interest, I have also gone through a paper discussing accountability for acquiring machine learning datasets: Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure.


Week 4

General Experience

Well, this week has been easier than the previous one, mostly because I did extra work during week 3. I guess I will call this week “Crash Course on Software Development and Object-Oriented Programming Using Python.” I created a general framework for our training file, including an initial, basic, linear algorithm. In addition, my code was mostly modular and functionally structured, so that adding new models to test, new algorithms, new learning frameworks, etc. wouldn’t affect the functionality of the code. I have also learned how to use argparse very well; I think I will use it in my upcoming projects! I have also learned about Invariant Risk Minimization, Adaptive Risk Minimization, and DomainBed. My main motivation for reading those papers was trying to understand their experimental frameworks and how I can use them as a basis for understanding the algorithm developed this summer.
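To show the kind of argparse setup I mean (with hypothetical flag names, not our actual interface), switching models, algorithms, or hyperparameters becomes a matter of command-line flags rather than code edits:

```python
import argparse

def build_parser():
    """Hypothetical command-line interface for a modular training script."""
    parser = argparse.ArgumentParser(description="Train a domain-generalization model")
    parser.add_argument("--algorithm", choices=["erm", "irm", "arm"], default="erm",
                        help="learning framework to use")
    parser.add_argument("--model", default="linear", help="model architecture name")
    parser.add_argument("--epochs", type=int, default=100)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--penalty-weight", type=float, default=1.0,
                        help="weight of the invariance penalty (IRM only)")
    parser.add_argument("--seed", type=int, default=0)
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))   # e.g. {'algorithm': 'irm', 'model': 'linear', 'epochs': 100, ...}
```

A call like `python train.py --algorithm irm --penalty-weight 10` would then select a different learning framework without touching the training code (assuming the script is called `train.py`).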

Readings (Literature, online blog posts, and tutorials)

I have read the Invariant Risk Minimization paper and the Adaptive Risk Minimization paper. To understand the Adaptive Risk Minimization paper, I had to go through an article about meta-learning. I have also read a blog post about learning theory: Empirical Risk Minimization. Finally, I went through the DomainBed repo, looked through their code, and skimmed the different algorithms the repo provides.

Experiments & Algorithms


Week 3

General Experience

This week was a roller coaster. I had so many struggles installing packages, setting up environments, downloading tools, fixing code, refactoring, writing/rewriting/deleting, etc. However, I was able to create a graphical causal model and run experiments on the LUCAS lung cancer toy dataset. I learned about the networkx library for dealing with graphs. I have also learned a lot about using args. I reviewed my OOP (object-oriented programming) skills and brushed up my numpy, matplotlib, and pandas knowledge. I have also been working on research blogs, which are more or less a literature review. I still believe I need tons of practice, but I feel like I am getting there. It’s been a very exhausting, laborious week, but it was enriching and interesting as well. I guess that’s what this research is about!

Readings (Literature Review, online blog posts, and tutorials)

I have read Review of Causal Discovery Methods Based on Graphical Models. I have also reviewed independence and read some parts of Causality for Machine Learning to get to know more about causality.

Experiments & Algorithms

Experiment using Jupyter notebooks

I have re-created the graphical causal model of the LUCAS dataset [figure: LUCAS causal graph]. For my experiment this week, I followed these steps:

  1. I created an SCM (structural causal model) using the LUCAS lung cancer toy dataset [figure: SCM graph]. Then, I generated dataframes using the SCM [figure: generated dataframe].
  2. Then, I chose a target outcome: Lung_Cancer for this dataset.
  3. I applied multiple causal discovery algorithms (e.g. PC, GES, etc.) – you can learn more about these algorithms from the readings I have posted.
  4. After that, I extracted parents and children from the causal graph. Then, I generated dataframes for the outcome, parents, and children; these are later used for the ERM, causal ERM, and anticausal ERM learning frameworks. The dataframes look exactly like the general dataframe but with only parents+outcome, children+outcome, or parents+children+outcome columns (see the sketch after this list).
  5. Then, I created a target dataset with different target domains (each with a different distribution).
  6. I ran my learning frameworks on the target dataset and generated a boxplot summarizing error values for the anticausal+causal and causal models.
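Here is a minimal sketch of step 4 using networkx and pandas (the graph edges and dataframe below are illustrative placeholders, not the full LUCAS graph): extract the outcome’s parents and children from the causal DAG and slice the corresponding dataframes for the causal, anticausal, and combined learning frameworks.

```python
import networkx as nx
import pandas as pd

def split_by_causal_role(graph: nx.DiGraph, df: pd.DataFrame, outcome: str):
    """Return the causal (parents), anticausal (children), and combined views of the data."""
    parents = list(graph.predecessors(outcome))    # direct causes of the outcome
    children = list(graph.successors(outcome))     # direct effects of the outcome
    causal_df = df[parents + [outcome]]
    anticausal_df = df[children + [outcome]]
    combined_df = df[parents + children + [outcome]]
    return causal_df, anticausal_df, combined_df

# Toy example loosely modeled on the LUCAS variables (edges here are illustrative):
g = nx.DiGraph([("Smoking", "Lung_Cancer"), ("Genetics", "Lung_Cancer"),
                ("Lung_Cancer", "Coughing"), ("Lung_Cancer", "Fatigue")])
df = pd.DataFrame(0, index=range(3),
                  columns=["Smoking", "Genetics", "Lung_Cancer", "Coughing", "Fatigue"])
causal_df, anticausal_df, combined_df = split_by_causal_role(g, df, "Lung_Cancer")
print(causal_df.columns.tolist())   # ['Smoking', 'Genetics', 'Lung_Cancer']
```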

Week 2

General Experience

This week has been better than the last week. I have continued working on the empirical experiment to prove the maths I have worked out. I have finished the beginner tutorials for PyTorch. I have read about causal graphical models and learned about causal discovery algorithms. Moreover, I have also reviewed neural networks and the mathematics behind them. However, I still believe that I need a lot of practice, e.g. writing a bunch of experiments, to get used to these tools. I learned a lot from designing my first empirical experiment. In addition, I have reviewed my Python and NumPy coding skills. Overall, it has been a very enriching week, and I am excited for next week!


Week 1

General Experience

I started research on Tuesday, June 2nd. My first week was exhausting and tough. I skimmed through two research papers: Invariant Risk Minimization and Invariant Models for Causal Transfer Learning. In addition, I read multiple blog posts about causal inference and causal models while brushing up my multivariable calculus, linear algebra, and probability theory knowledge. I have also looked through PyTorch tutorials to get ready to run empirical experiments on our findings. I think I have learned a lot during this week. I think working online is kind of inconvenient because I am not in close proximity to the research lab or my mentor. Although they are very responsive and approachable through online communication channels, I still think it would have been easier if we had been in an in-person setting. One of the coolest things I read this week was “If both Newton’s apple and the planets obey the same equations, chances are that gravitation is a thing” from the IRM research paper.

Findings

  • In most machine learning models, data are often marred by selection biases, confounding factors, and other peculiarities. Therefore, to avoid the excessive reliance of machine learning systems on data biases, we try to leverage tools from causation to develop the mathematics of spurious and invariant correlations.
  • We are investigating a learning paradigm called Invariant Risk Minimization (IRM), which estimates nonlinear, invariant, causal predictors from training environments to enable out-of-distribution (OOD) generalization. In other words, we collect data under multiple training environments, each with a different distribution, and we want to use these datasets to learn a predictor that will perform well across a large set of unseen but related environments (see the sketch after this list).
  • Invariant causal prediction techniques search for a subset of variables that, when used to estimate individual regressions for each environment, produce regression residuals with equal distribution across all environments.
  • Transfer learning attempts to address the scenario in which distributions may change between training and testing; therefore, we want to find a set of features or predictors that can work on unseen domains of data.
  • I found that the expected loss of a causal model using X and Z can be less than the expected loss of a causal model using only X under certain conditions on the external errors and internal biases, where X and Z are random variables: X is a causal variable and Z is an anticausal variable. Manipulating these factors can influence how well our causal model predicts on unseen test data.
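For reference, this is the IRM objective as I understand it from the paper, together with the practical IRMv1 relaxation that turns the invariance constraint into a gradient penalty with weight λ:

```latex
% IRM: find a representation \Phi whose optimal classifier w is the same in every training environment
\min_{\Phi,\, w}\ \sum_{e \in \mathcal{E}_{\mathrm{tr}}} R^{e}(w \circ \Phi)
\quad \text{subject to} \quad
w \in \arg\min_{\bar{w}} R^{e}(\bar{w} \circ \Phi) \quad \forall e \in \mathcal{E}_{\mathrm{tr}}

% IRMv1: practical relaxation with a fixed "dummy" classifier w = 1.0 and penalty weight \lambda
\min_{\Phi}\ \sum_{e \in \mathcal{E}_{\mathrm{tr}}}
  R^{e}(\Phi) + \lambda \left\lVert \nabla_{w \mid w = 1.0}\, R^{e}(w \cdot \Phi) \right\rVert^{2}
```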

Frustrations

The mathematics in the academic literature is complex and complicated. It took me a lot of time to follow and understand it. There is very tedious work of reduction and algebraic simplification, in addition to using many properties from probability theory and linear algebra. I think this image summarizes my mathematical struggle this week: [figure: Week 1 frustrations]. I guess I will need some time to get accustomed to the mathematical jargon and to using the different properties of probability theory and linear algebra.

Plans for next week

I am going to read the research papers in depth, explore the GitHub repos and the algorithms implemented in them, and try to run some tests in Python. I still haven’t figured out the answer to the first problem I encountered, but I hope I will find one by next week.

Other information worth sharing!

So far, there have not been any tests or algorithms to write, but I expect to start doing that next week. In addition to research, as a side hobby, I am learning about data analysis methods that some companies implement and trying to find connections between what we are researching and what data analysis is about. I really do believe that machine learning models are going to change the way we think about data and the world.
