Proposal
Problem
Classification on tabular data using deep neural networks
Possible solutions
First Approach
Contrastive Learning for Tabular Data Pretraining
Using a TabContrast approach:
- Generate positive pairs by applying augmentations (e.g., random feature corruption) to the tabular data
- Implement an encoder architecture (e.g., a simple MLP)
- Use NT-Xent loss (a sketch follows this list):
- Maximize similarity between positive pairs
- Minimize similarity between negative pairs (other samples in the batch)
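A minimal PyTorch sketch of this pretraining step, assuming a small MLP encoder and feature-corruption augmentation (both are placeholder choices, not fixed by this proposal):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TabEncoder(nn.Module):
    """Placeholder tabular encoder: a small MLP (architecture is an assumption)."""
    def __init__(self, n_features, emb_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(),
            nn.Linear(128, emb_dim),
        )

    def forward(self, x):
        return self.net(x)

def augment(x, corruption_rate=0.3):
    """One common tabular augmentation: randomly replace a subset of features
    with values taken from other rows in the batch."""
    mask = torch.rand_like(x) < corruption_rate
    shuffled = x[torch.randperm(x.size(0))]
    return torch.where(mask, shuffled, x)

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent: pull the two augmented views of each row together and push
    them away from every other sample in the batch."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=-1)   # (2N, D)
    sim = z @ z.t() / temperature                         # cosine similarities
    # Mask out self-similarity on the diagonal.
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool, device=z.device), -9e15)
    # The positive for row i is row i + N (and vice versa).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# One pretraining step on a random batch (32 rows, 20 features).
x = torch.randn(32, 20)
encoder = TabEncoder(n_features=20)
loss = nt_xent_loss(encoder(augment(x)), encoder(augment(x)))
loss.backward()
```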
I could use something else, but I think this method allows generating more artificial training data, which could be beneficial.
Maybe ReConTab + TabContrast, but that may be too complex.
Transforming Table Rows into Images
- Use REFINED, etc., to convert rows to image data
- Use a pre-trained CNN to extract features (see the sketch below)
Would it be crazy to use contrastive learning in this phase as well?
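A rough sketch of this step. It does not implement REFINED itself (which learns a feature-to-pixel layout from feature similarity); the rows_to_images reshape below is only a naive stand-in so the pre-trained CNN part is concrete:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

def rows_to_images(x, img_size=32):
    """Naive placeholder for REFINED: zero-pad each row, reshape it into a
    square single-channel 'image', then repeat it to 3 channels."""
    n, d = x.shape
    pad = img_size * img_size - d
    assert pad >= 0, "img_size too small for the number of features"
    imgs = F.pad(x, (0, pad)).view(n, 1, img_size, img_size)
    return imgs.repeat(1, 3, 1, 1)  # fake RGB for the pretrained CNN

class CNNFeatureExtractor(nn.Module):
    """Pre-trained ResNet-18 with the classification head removed."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop fc

    def forward(self, imgs):
        # Upsample to the resolution the backbone was trained on.
        imgs = F.interpolate(imgs, size=224, mode="bilinear", align_corners=False)
        return self.features(imgs).flatten(1)  # (N, 512)

# Usage: 8 rows with 30 features each -> 8 image-derived feature vectors.
rows = torch.randn(8, 30)
cnn_feats = CNNFeatureExtractor()(rows_to_images(rows))
print(cnn_feats.shape)  # torch.Size([8, 512])
```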
Merging the Two Information Sources
Using an attention-based fusion approach:
- For tabular data: extract the lower-dimensional embeddings from our encoder.
- For images: use our CNN to extract the lower-dimensional representation of the image.
- Fusion:
- Project both embeddings to the same dimension (this might require an extra projection head?)
- Implement cross-attention: one modality's embedding provides the queries, the other the keys and values (a sketch follows this list)
- Pass through a final MLP for classification
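A sketch of the fusion head, assuming the tabular encoder outputs 64-d embeddings and the CNN outputs 512-d features (both dimensions and the head sizes are illustrative assumptions). The tabular embedding acts as the query and the image embedding supplies the keys and values:

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse a tabular embedding and an image embedding via cross-attention."""
    def __init__(self, tab_dim=64, img_dim=512, fusion_dim=128, n_classes=2):
        super().__init__()
        # Projection heads bring both modalities to the same dimension.
        self.tab_proj = nn.Linear(tab_dim, fusion_dim)
        self.img_proj = nn.Linear(img_dim, fusion_dim)
        # Tabular embedding queries the image embedding (keys/values).
        self.cross_attn = nn.MultiheadAttention(fusion_dim, num_heads=4, batch_first=True)
        # Final MLP classifier on the concatenated representations.
        self.classifier = nn.Sequential(
            nn.Linear(2 * fusion_dim, fusion_dim),
            nn.ReLU(),
            nn.Linear(fusion_dim, n_classes),
        )

    def forward(self, tab_emb, img_emb):
        q = self.tab_proj(tab_emb).unsqueeze(1)   # (N, 1, fusion_dim)
        kv = self.img_proj(img_emb).unsqueeze(1)  # (N, 1, fusion_dim)
        attended, _ = self.cross_attn(q, kv, kv)  # tabular attends to image
        fused = torch.cat([q.squeeze(1), attended.squeeze(1)], dim=-1)
        return self.classifier(fused)

# Usage with dummy embeddings for a batch of 8 samples.
logits = CrossAttentionFusion()(torch.randn(8, 64), torch.randn(8, 512))
print(logits.shape)  # torch.Size([8, 2])
```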
Second Approach
Given a tabular dataset, use multimodal learning where the two modalities are as follows:
- Image data: generated the same way as before, using REFINED, etc.
- Tabular data: the raw dataset.
Directly use the approach presented in Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data.
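As a reference for what that objective looks like, here is a simplified CLIP-style contrastive loss between matched tabular and image embeddings; it is only a sketch of the general idea, not the exact training recipe from the paper:

```python
import torch
import torch.nn.functional as F

def multimodal_contrastive_loss(tab_emb, img_emb, temperature=0.1):
    """The i-th tabular row and its image rendering form a positive pair;
    all other pairings in the batch are negatives."""
    tab = F.normalize(tab_emb, dim=-1)
    img = F.normalize(img_emb, dim=-1)
    logits = tab @ img.t() / temperature                      # (N, N) similarities
    targets = torch.arange(tab.size(0), device=tab.device)    # matched pairs on the diagonal
    # Symmetric cross-entropy: tabular -> image and image -> tabular.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```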