Week 5

General Experience

I guess easiness isn’t guaranteed when it comes to research! This week has been a very challenging week. I started the week by learning about Neural Networks. I had encountered Neural Networks in my Machine Learning class at Whitman before, and I have actually trained some models, but I had never gone as deep into understanding them as I did this week. I have also realized the ubiquity and utility of open source, since I am currently using/reusing multiple libraries, packages, and code from GitHub! I have gone through the Set_Transformer repository, and it has been very helpful. In fact, I have trained a set transformer to predict certain properties of datasets, where those datasets come from different distributions. In other words, let’s dismantle the i.i.d. (independent and identically distributed) assumption about machine learning datasets!

Readings (Literature, online blog posts, and tutorials)

I have read Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks, and I have gone through the following tutorials to review my neural network knowledge:

  1. Deep Learning
  2. Transfer Learning
  3. Self Attention
  4. Neural Networks in Python and R

Finally, I have gone through the Set_Transformer repo in detail. It has been a lot of reading and learning this week. Very engaging, challenging, and profound. Out of personal interest, I have also gone through a paper discussing accountability for acquiring machine learning datasets: Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure.

Experiments & Algorithms

As mentioned above, I have trained a simple set transformer that takes in a multi-feature dataset and predicts certain properties of that dataset. So, on a high level, I have done the following:

  1. I generated a list of datasets. Each dataset in the list follows a unique Structural Causal Model (SCM), so each one effectively comes from a different source domain (see the first sketch after this list).
  2. I manually calculated the properties of interest and stored them as targets for training and testing.
  3. I also extracted the specific features we need from each dataset and prepared them for training and testing.
  4. Then, I created a simple set transformer and trained it on the datasets (see the second sketch after this list).
  5. Finally, I kept iterating, trying different loss functions and comparing my results to the ultimate objective of the experiment.
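
To make step 1 concrete, here is a minimal sketch of the kind of data generation I mean. It assumes a toy linear SCM (X causes Y) whose coefficients are resampled for every dataset; the function name, the specific SCMs, and the properties I actually predicted are simplifications for illustration, not the exact code I ran.

```python
import numpy as np

def generate_scm_dataset(n_samples, rng):
    """Hypothetical example: draw one dataset from a simple linear SCM X -> Y
    with randomly sampled coefficients, so every call corresponds to a
    different source domain."""
    # Each dataset gets its own mechanism parameters -- these are the
    # dataset-level "properties" the set transformer should later recover.
    slope = rng.uniform(-2.0, 2.0)
    noise_scale = rng.uniform(0.1, 1.0)

    x = rng.normal(0.0, 1.0, size=n_samples)
    y = slope * x + rng.normal(0.0, noise_scale, size=n_samples)

    features = np.stack([x, y], axis=1)          # shape (n_samples, 2)
    properties = np.array([slope, noise_scale])  # targets for training/testing
    return features, properties

rng = np.random.default_rng(0)
datasets = [generate_scm_dataset(200, rng) for _ in range(500)]
X = np.stack([d[0] for d in datasets])  # (500, 200, 2): 500 sets of 200 points
y = np.stack([d[1] for d in datasets])  # (500, 2): one property vector per set
```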
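And for steps 4 and 5, here is a simplified, permutation-invariant stand-in for the model: one self-attention block over the elements of each set, mean pooling, and a regression head, trained with MSE on the `X` and `y` arrays from the sketch above. The real model uses the SAB/ISAB/PMA blocks from the Set_Transformer repo; this is just a rough sketch of the training setup, not the exact architecture.

```python
import torch
import torch.nn as nn

class TinySetRegressor(nn.Module):
    """Toy set-to-vector regressor: attention across set elements, then a
    permutation-invariant pooling step, then a regression head."""
    def __init__(self, dim_in=2, dim_hidden=64, dim_out=2, num_heads=4):
        super().__init__()
        self.embed = nn.Linear(dim_in, dim_hidden)
        self.attn = nn.MultiheadAttention(dim_hidden, num_heads, batch_first=True)
        self.head = nn.Linear(dim_hidden, dim_out)

    def forward(self, x):                 # x: (batch, set_size, dim_in)
        h = self.embed(x)
        h, _ = self.attn(h, h, h)         # self-attention across set elements
        return self.head(h.mean(dim=1))   # mean pooling is permutation-invariant

model = TinySetRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()                    # one of the losses I experimented with

xb = torch.tensor(X, dtype=torch.float32)
yb = torch.tensor(y, dtype=torch.float32)
for epoch in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(xb), yb)
    loss.backward()
    optimizer.step()
```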

Results and Findings

Experimental Results

I have trained my model and looked at the training loss. Here is a snippet of what the results look like: [training loss plot]. Pretty good, right! In the real world, we don’t manually create datasets with certain properties; we don’t even know what those properties are. Using this model, we are hoping to be able to predict those properties without knowing the exact distribution of our data. Then, using these predicted properties, we will be able to decide which learning/training framework to use for our data.

Research Findings

  • Deep learning mostly tackles instance-based problems, mapping a fixed-dimensional input tensor to its corresponding target value.
  • Self-attention is the core of transformer-based architectures, where we draw global dependencies between inputs and outputs, and this is especially important when training neural networks on data coming from different domains and different distributions.
  • The novel set-input deep neural network, the Set Transformer, uses self-attention mechanisms to process every element in an input set, reducing the O(n^2) computation time to O(nm), where m is usually a small fixed hyperparameter (read the paper for more info). A rough sketch of this inducing-point trick is below.
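
Here is how I understand the inducing-point idea: instead of letting all n set elements attend to each other (O(n^2)), m learned inducing points first attend to the n elements, and the elements then attend back to those m summaries, costing O(nm). This is only a hedged sketch of the concept; the actual ISAB block in the paper and repo also includes residual connections, feed-forward layers, and optional layer norm that I omit here.

```python
import torch
import torch.nn as nn

class InducedAttention(nn.Module):
    """Rough sketch of the inducing-point idea (ISAB in the paper):
    m learned inducing points summarize the set, and the set elements then
    read from those summaries, so the cost is O(n*m) instead of O(n^2)."""
    def __init__(self, dim=64, num_heads=4, num_inducing=16):
        super().__init__()
        self.inducing = nn.Parameter(torch.randn(1, num_inducing, dim))
        self.attn1 = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn2 = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):                          # x: (batch, n, dim)
        i = self.inducing.expand(x.size(0), -1, -1)
        h, _ = self.attn1(i, x, x)                 # m inducing points attend to the n elements
        out, _ = self.attn2(x, h, h)               # elements attend back to the m summaries
        return out                                 # (batch, n, dim)
```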

While there aren’t a lot of important findings listed here, I think all the tutorials, blog posts, articles, etc. mentioned in the Readings section are very important and useful. Without them, I definitely wouldn’t have been able to understand what was going on!

Frustrations

My main frustration this week was the difficulty of using neural networks and the Set Transformer. Can you imagine I once got an accuracy of 63,000%, which is completely wrong! I also had a lot of learning to do because I needed to use a GPU instead of my laptop’s CPU. So, Stack Overflow was my friend this week as I figured out how to set up libraries, GitHub repos, packages, and files on Google Colab. That was helpful, yet frustrating!
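
For the GPU part, this is a minimal version of the device-handling pattern I ended up using on Colab. It assumes the `model`, `xb`, and `yb` names from the training sketch earlier; the point is just that the model and its batches have to live on the same device.

```python
import torch

# Pick the Colab GPU if the runtime has one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Training on:", device)

model = model.to(device)                # move the model's weights to the GPU
xb, yb = xb.to(device), yb.to(device)   # inputs and targets must be on the same device
```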

Written on July 4, 2021