Deep Tabular Representation Learning and its Applications: A Survey
Paper List
This list summarizes the main branches of work on deep tabular representation learning, including downstream tasks and applications. For more details, please refer to our recent survey (paper).
Branch 1: Deep Tabular Representation Learning
1.1 Heterogeneous Feature Representation Learning
1.1.1 Kernel-Based
-
Feature Learning for Interpretable, Performant Decision Trees. (paper)
Jack H. Good, Torin Kovach, Kyle Miller, Artur Dubrawski
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
-
Kernel Density Decision Trees. (paper)
Jack H. Good, Kyle Miller, Artur Dubrawski
The 36th AAAI Conference on Artificial Intelligence (AAAI 2022)
-
Heterogeneous Risk Minimization. (paper)
Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen
The 38th International Conference on Machine Learning (ICML 2021)
-
Kernelized Heterogeneous Risk Minimization. (paper)
Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
1.1.2 Binning-Based Representation Learning
-
Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains.
Kyungeun Lee, Ye Seul Sim, Hye-Seung Cho, Moonjung Eo, Suhee Yoon, Sanghyu Yoon, Woohyung Lim
The 41st International Conference on Machine Learning (ICML 2024)
-
TabLog: Test-Time Adaptation for Tabular Data Using Logic Rules.
Weijieying Ren, Xiaoting Li, Huiyuan Chen, Vineeth Rakesh, Zhuoyi Wang, Mahashweta Das, Vasant Honavar
The 41st International Conference on Machine Learning (ICML 2024)
-
Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing. (paper)
Sai Praneeth Karimireddy, Lie He, Martin Jaggi
The 10th International Conference on Learning Representations (ICLR 2022)
-
On Embeddings for Numerical Features in Tabular Deep Learning. (paper)
Yury Gorishniy, Ivan Rubachev, Artem Babenko
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
-
Learning Binary Decision Trees by Argmin Differentiation. (paper)
Valentina Zantedeschi, Matt J. Kusner, Vlad Niculae
The 39th International Conference on Machine Learning (ICML 2022)
1.1.3 Latent Representation Learning
-
BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model.
Chenwei Xu, Yu-Chao Huang, Jerry Yao-Chieh Hu, Weijian Li, Ammar Gilani, Hsi-Sheng Goan, Han Liu
The 41st International Conference on Machine Learning (ICML 2024)
-
Transferable Adversarial Robustness for Categorical Data via Universal Robust Embeddings. (paper)
Klim Kireev, Maksym Andriushchenko, Carmela Troncoso, Nicolas Flammarion
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
-
Revisiting Deep Learning Models for Tabular Data. (paper)
Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, Artem Babenko
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
1.2 Inter-Column Dependency Modeling
1.2.1 Tree-based
-
GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data. (paper)
Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt
The 12th International Conference on Learning Representations (ICLR 2024)
-
POETREE: Interpretable Policy Learning with Adaptive Decision Trees. (paper)
Alizée Pace, Alex Chan, Mihaela van der Schaar
The 10th International Conference on Learning Representations (ICLR 2022)
-
Instance-Based Uncertainty Estimation for Gradient-Boosted Regression Trees. (paper)
Jonathan Brophy, Daniel Lowd
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
- Local Contrastive Feature Learning for Tabular Data. (paper)
Zhabiz Gharibshah, Xingquan Zhu.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM 2022)
-
Subgroup Robustness Grows On Trees: An Empirical Baseline Investigation. (paper)
Joshua P Gardner, Zoran Popović, Ludwig Schmidt
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
1.2.2 Graph-based
-
On Class Distributions Induced by Nearest Neighbor Graphs for Node Classification of Tabular Data. (paper)
Federico Errica
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
- GOGGLE: Generative Modelling for Tabular Data by Learning Relational Structure. (paper)
Tennison Liu, Zhaozhi Qian, Jeroen Berrevoets, Mihaela van der Schaar
The 11th International Conference on Learning Representations (ICLR 2023)
-
HyTrel: Hypergraph-enhanced Tabular Data Representation Learning. (paper)
Pei Chen, Soumajyoti Sarkar, Leonard Lausen, Balasubramaniam Srinivasan, Sheng Zha, Ruihong Huang, George Karypis
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
- T2G-FORMER: Organizing Tabular Features into Relation Graphs Promotes Heterogeneous Feature Interaction. (paper)
Jiahuan Yan, Jintai Chen, Yixuan Wu, Danny Z. Chen, Jian Wu
The 37th AAAI Conference on Artificial Intelligence (AAAI 2023)
- Does your graph need a confidence boost? Convergent boosted smoothing on graphs with tabular node features. (paper)
Jiuhai Chen, Jonas Mueller, Vassilis N. Ioannidis, Soji Adeshina, Yangkun Wang, Tom Goldstein, David Wipf
The 10th International Conference on Learning Representations (ICLR 2022)
- Table2Graph: Transforming Tabular Data to Unified Weighted Graph. (paper)
Kaixiong Zhou, Zirui Liu, Rui Chen, Li Li, Soo-Hyun Choi, Xia Hu
The 31st International Joint Conference on Artificial Intelligence (IJCAI 2022)
1.2.3 Rule-based Method
-
Logical Activation Functions: Logit-space equivalents of Probabilistic Boolean Operators. (paper)
Scott C Lowe, Robert Earle, Jason d'Eon, Thomas Trappenberg, Sageev Oore.
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
1.2.4 Transformer-based
- Polynomial-based Self-Attention for Table Representation Learning.
Jayoung Kim, Yehjin Shin, Jeongwhan Choi, Hyowon Wi, Noseong Park
The 41st International Conference on Machine Learning (ICML 2024)
- Arithmetic Feature Interaction Is Necessary for Deep Tabular Learning. (paper)
Yi Cheng, Renjun Hu, Haochao Ying, Xing Shi, Jian Wu, Wei Lin
The 38th AAAI Conference on Artificial Intelligence (AAAI 2024)
- Divide Rows and Conquer Cells: Towards Structure Recognition for Large Tables. (paper)
Huawen Shen, Xiang Gao, Jin Wei, Liang Qiao, Yu Zhou, Qiang Li, Zhanzhan Cheng
The 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023)
- DIWIFT: Discovering Instance-wise Influential Features for Tabular Data.
Dugang Liu, Pengxiang Cheng, Hong Zhu, Xing Tang, Yanyu Chen, Xiaoting Wang, Weike Pan, Zhong Ming, Xiuqiang He
(TheWebConf 2023) (paper)
-
TransTab: Learning Transferable Tabular Transformers Across Tables.
Zifeng Wang, Jimeng Sun.
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022) (paper)
1.2.5 Additive-model-based
-
Sparse Interaction Additive Networks via Feature Interaction Detection and Sparse Selection.
James Enouen, Yan Liu.
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022) (paper)
1.2.6 Masking Modeling
-
MCM: Masked Cell Modeling for Anomaly Detection in Tabular Data.
Jiaxin Yin, Yuanyuan Qiao, Zitang Zhou, Xiangchao Wang, Jie Yang
The 12th International Conference on Learning Representations (ICLR 2024).
(paper)
-
Data-Efficient and Interpretable Tabular Anomaly Detection.
Chun-Hao Chang, Jinsung Yoon, Sercan Arik, Madeleine Udell, Tomas Pfister
The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2023) (paper)
-
Anomaly detection for tabular data with internal contrastive learning.
Tom Shenkar, Lior Wolf
The 10th International Conference on Learning Representations (ICLR 2022)
(paper)
1.2.7 Neural Architecture Search
-
TabNAS: Rejection Sampling for Neural Architecture Search on Tabular Datasets. (paper)
Chengrun Yang, Gabriel Bender, Hanxiao Liu, Pieter-Jan Kindermans, Madeleine Udell, Yifeng Lu, Quoc V Le, Da Huang
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
1.3 Self-supervised Representation Learning
- SwitchTab: Switched Autoencoders Are Effective Tabular Learners. (paper)
Jing Wu, Suiyao Chen, Qi Zhao, Renat Sergazinov, Chen Li, Shengjie Liu
The 38th AAAI Conference on Artificial Intelligence (AAAI 2024)
- STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables.
Jaehyun Nam, Jihoon Tack, Kyungmin Lee, Hankook Lee, Jinwoo Shin
The 11th International Conference on Learning Representations (ICLR 2023)
(paper)
- Self-Supervision Enhanced Feature Selection with Correlated Gates. (paper)
Changhee Lee, Fergus Imrie, Mihaela van der Schaar
The 10th International Conference on Learning Representations (ICLR 2022)
- Scarf: Self-Supervised Contrastive Learning using Random Feature Corruption. (paper)
Dara Bahri, Heinrich Jiang, Yi Tay, Donald Metzler
The 10th International Conference on Learning Representations (ICLR 2022)
- Local Contrastive Feature Learning for Tabular Data. (paper)
Zhabiz Gharibshah, Xingquan Zhu.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM 2022)
- CORE: Self- and Semi-supervised Tabular Learning with COnditional REgularizations. (paper)
Xintian Han, Rajesh Ranganath
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
- SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning. (paper)
Talip Ucar, Ehsan Hajiramezanali, Lindsay Edwards
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
- VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain. (paper)
Jinsung Yoon, Yao Zhang, James Jordon, Mihaela van der Schaar
The 34th Annual Conference on Neural Information Processing Systems (NeurIPS 2020)
1.4 Clustering-based Representation Learning
-
ProtoGate: Prototype-based Neural Networks with Global-to-local Feature Selection for Tabular Biomedical Data.
Xiangjian Jiang, Andrei Margeloiu, Nikola Simidjievski, Mateja Jamnik
The 41st International Conference on Machine Learning (ICML 2024)
-
PTaRL: Prototype-based Tabular Representation Learning via Space Calibration.
Hangting Ye, Wei Fan, Xiaozhuang Song, Shun Zheng, He Zhao, Dan dan Guo, Yi Chang
The 12th International Conference on Learning Representations (ICLR 2024)
(paper)
-
TabR: Tabular Deep Learning Meets Nearest Neighbors. (paper)
Yury Gorishniy, Ivan Rubachev, Nikolay Kartashev, Daniil Shlenskii, Akim Kotelnikov, Artem Babenko
The 12th International Conference on Learning Representations (ICLR 2024).
-
On Class Distributions Induced by Nearest Neighbor Graphs for Node Classification of Tabular Data. (paper)
Federico Errica
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
-
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data. (paper)
Nabeel Seedat, Jonathan Crabbé, Ioana Bica, Mihaela van der Schaar
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
-
Learning Enhanced Representation for Tabular Data via Neighborhood Propagation. (paper)
Kounianhua Du, Weinan Zhang, Ruiwen Zhou, Yangkun Wang, Xilong Zhao, Jiarui Jin, Quan Gan, Zheng Zhang, David Wipf
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
1.5 Multi-modality Representation Learning
- TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
Weichao Zhao, Hao Feng, Qi Liu, Jingqun Tang, Binghong Wu, Lei Liao, Shu Wei, Yongjie Ye, Hao Liu, Wengang Zhou, Houqiang Li, Can Huang
The 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
1.6 Learning with External Knowledge
1.6.1 Learning with Good Model Initialization
- XTab: Cross-table Pretraining for Tabular Transformers. (paper)
Bingzhao Zhu, Xingjian Shi, Nick Erickson, Mu Li, George Karypis, Mahsa Shoaran
The 40th International Conference on Machine Learning (ICML 2023)
- Numerical Tuple Extraction from Tables with Pre-training. (paper)
Qingping Yang, Yixuan Cao, Yingming Hu, Jianfeng Li, Nanbo Peng, Ping Luo.
The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2022)
- Well-tuned Simple Nets Excel on Tabular Datasets. (paper)
Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
1.6.2 Learning with Knowledge Graph
-
High dimensional, tabular deep learning with an auxiliary knowledge graph. (paper)
Camilo Ruiz, Hongyu Ren, Kexin Huang, Jure Leskovec
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
- External Knowledge Infusion for Tabular Pre-training Models with Dual-adapters. (paper)
Can Qin, Sungchul Kim, Handong Zhao, Tong Yu, Ryan A. Rossi, Yun Fu.
The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2022)
1.6.3 Learning with Large Language Models
- TableRAG: Million-Token Table Understanding with Language Models.
Si-An Chen, Lesly Miculicich, Julian Eisenschlos, Zifeng Wang, Zilong Wang, Yanfei Chen, Yasuhisa Fujii, Hsuan-Tien Lin, Chen-Yu Lee, Tomas Pfister
The 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
- Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning.
Sungwon Han, Jinsung Yoon, Sercan Arik, Tomas Pfister
The 41st International Conference on Machine Learning (ICML 2024)
- OpenTab: Advancing Large Language Models as Open-domain Table Reasoners. (paper)
Kezhi Kong, Jiani Zhang, Zhengyuan Shen, Balasubramaniam Srinivasan, Chuan Lei, Christos Faloutsos, Huzefa Rangwala, George Karypis
The 12th International Conference on Learning Representations (ICLR 2024)
- Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources. (paper)
Xingxuan Li, Ruochen Zhao, Yew Ken Chia, Bosheng Ding, Shafiq Joty, Soujanya Poria, Lidong Bing
The 12th International Conference on Learning Representations (ICLR 2024)
- Making Pre-trained Language Models Great on Tabular Prediction. (paper)
Jiahuan Yan, Bo Zheng, Hongxia Xu, Yiheng Zhu, Danny Chen, Jimeng Sun, Jian Wu, Jintai Chen
The 12th International Conference on Learning Representations (ICLR 2024)
-
UniTabE: A Universal Pretraining Protocol for Tabular Foundation Model in Data Science. (paper)
Yazheng Yang, Yuqi Wang, Guang Liu, Ledell Wu, Qi Liu
The 12th International Conference on Learning Representations (ICLR 2024)
- Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding. (paper)
Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister
The 12th International Conference on Learning Representations (ICLR 2024)
-
Parameterizing Context: Unleashing the Power of Parameter-Efficient Fine-Tuning and In-Context Tuning for Continual Table Semantic Parsing. (paper)
Yongrui Chen, Shenyu Zhang, Guilin Qi, Xinnan Guo
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
- Trompt: Towards a Better Deep Neural Network for Tabular Data. (paper)
Kuan-Yu Chen, Ping-Han Chiang, Hsin-Rung Chou, Ting-Wei Chen, Tien-Hao Chang
The 40th International Conference on Machine Learning (ICML 2023)
-
Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering. (paper)
Noah Hollmann, Samuel Müller, Frank Hutter
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
-
Language Models are Weak Learners. (paper)
Hariharan Manikandan, Yiding Jiang, J Zico Kolter
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
- DyRRen: A Dynamic Retriever-Reranker-Generator Model for Numerical Reasoning over Tabular and Textual Data. (paper)
Xiao Li, Yin Zhu, Sichen Liu, Jiangzhou Ju, Yuzhong Qu, Gong Cheng
The 37th AAAI Conference on Artificial Intelligence (AAAI 2023)
- Retrieve, Reason, and Refine: Generating Accurate and Faithful Patient Instructions. (paper)
Fenglin Liu, Bang Yang, Chenyu You, Xian Wu, Shen Ge, Zhangdaihong Liu, Xu Sun, Yang Yang, David A. Clifton
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
-
TAPEX: Table Pre-training via Learning a Neural SQL Executor. (paper)
Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou
The 10th International Conference on Learning Representations (ICLR 2022)
1.7 Causal Representation Learning
- TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second. (paper)
Noah Hollmann, Samuel Müller, Katharina Eggensperger, Frank Hutter
The 11th International Conference on Learning Representations (ICLR 2023)
- Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment. (paper)
Chenxiao Yang, Qitian Wu, Qingsong Wen, Zhiqiang Zhou, Liang Sun, Junchi Yan
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
- Transfer Learning on Heterogeneous Feature Spaces for Treatment Effects Estimation. (paper)
Ioana Bica, Mihaela van der Schaar
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
- SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes. (paper)
Zhaozhi Qian, Yao Zhang, Ioana Bica, Angela Mary Wood, Mihaela van der Schaar
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
Branch 2: Downstream Tasks
2.1 Generation
2.1.1 GAN-based Models
- Invertible Tabular GANs: Killing Two Birds with One Stone for Tabular Data Synthesis. (paper)
Jaehoon Lee, Jihyeon Hyeong, Jinsung Jeon, Noseong Park, Jihoon Cho
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
- DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks. (paper)
Boris van Breugel, Trent Kyono, Jeroen Berrevoets, Mihaela van der Schaar
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
2.1.2 VAE-based Models
- A Learnable Discrete-Prior Fusion Autoencoder with Contrastive Learning for Tabular Data Synthesis. (paper)
Rongchao Zhang, Yiwei Lou, Dexuan Xu, Yongzhi Cao, Hanpin Wang, Yu Huang
The 38th AAAI Conference on Artificial Intelligence (AAAI 2024)
2.1.3 Diffusion-based Models
- Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees. (paper)
Alexia Jolicoeur-Martineau, Kilian Fatras, Tal Kachman
(AISTATS 2024)
- Mixed-Type Tabular Data Synthesis with Score-based Diffusion in Latent Space. (paper)
Hengrui Zhang, Jiani Zhang, Zhengyuan Shen, Balasubramaniam Srinivasan, Xiao Qin, Christos Faloutsos, Huzefa Rangwala, George Karypis
The 12th International Conference on Learning Representations (ICLR 2024)
- A Flexible Generative Model for Heterogeneous Tabular EHR with Missing Modality. (paper)
Huan He, William Hao, Yuanzhe Xi, Yong Chen, Bradley Malin, Joyce Ho
The 12th International Conference on Learning Representations (ICLR 2024)
- CoDi: Co-evolving Contrastive Diffusion Models for Mixed-type Tabular Synthesis. (paper)
Chaejeong Lee, Jayoung Kim, Noseong Park
The 40th International Conference on Machine Learning (ICML 2023)
- TabDDPM: Modelling Tabular Data with Diffusion Models. (paper)
Akim Kotelnikov, Dmitry Baranchuk, Ivan Rubachev, Artem Babenko
The 40th International Conference on Machine Learning (ICML 2023)
- Concrete Score Matching: Generalized Score Matching for Discrete Data. (paper)
Chenlin Meng, Kristy Choi, Jiaming Song, Stefano Ermon
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
- SOS: Score-based Oversampling Minor Classes for Tabular Data. (paper)
Jayoung Kim, Chaejeong Lee, Yehjin Shin, Sewon Park, Minjung Kim, Noseong Park, Jihoon Cho
The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2022)
2.1.4 Transformer-based
-
ReMasker: Imputing Tabular Data with Masked Autoencoding. (paper)
Tianyu Du, Luca Melis, Ting Wang
The 12th International Conference on Learning Representations (ICLR 2024)
-
TabMT: Generating tabular data with masked transformers. (paper)
Manbir S Gulati, Paul F Roysdon
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
-
REaLTabFormer: Generating realistic relational and tabular data using transformers. (paper)
Aivin V. Solatorio, Olivier Dupriez
(arXiv 2023)
-
FATA-Trans: Field and time-aware transformer for sequential tabular data. (paper)
Dongyu Zhang, Liang Wang, Xin Dai, Shubham Jain, Junpeng Wang, Yujie Fan, Chin-Chia Michael Yeh, Yan Zheng, Zhongfang Zhuang, Wei Zhang
The 32nd ACM International Conference on Information and Knowledge Management (CIKM 2023)
2.1.5 Large Language Model-based
-
CuTS: Customizable Tabular Synthetic Data Generation.
Mark Vero, Mislav Balunovic, Martin Vechev
The 41st International Conference on Machine Learning (ICML 2024)
-
Language-Interfaced Tabular Oversampling via Progressive Imputation and Self-Authentication. (paper)
June Yong Yang, Geondo Park, Joowon Kim, Hyeongwon Jang, Eunho Yang
The 12th International Conference on Learning Representations (ICLR 2024)
- Language Models are Realistic Tabular Data Generators. (paper)
Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, Gjergji Kasneci
The 11th International Conference on Learning Representations (ICLR 2023)
- Generative Table Pre-training Empowers Models for Tabular Prediction. (paper)
Tianping Zhang, Shaowen Wang, Shuicheng Yan, Jian Li, Qian Liu
(EMNLP 2023)
- nBIIG: A Neural BI Insights Generation System for Table Reporting.
Yotam Perlitz, Dafna Sheinwald, Noam Slonim, Michal Shmueli-Scheuer
The 37th AAAI Conference on Artificial Intelligence (AAAI 2023) (paper)
2.1.6 Model-agnostic
-
How Realistic Is Your Synthetic Data? Constraining Deep Generative Models for Tabular Data.
Mihaela C Stoian, Salijona Dyrmishi, Maxime Cordy, Thomas Lukasiewicz, Eleonora Giunchiglia
The 12th International Conference on Learning Representations (ICLR 2024, interesting!) (paper)
2.2 Anomaly Detection
-
Beyond Individual Input for Deep Anomaly Detection on Tabular Data.
Hugo Thimonier, Fabrice Popineau, Arpad Rimmel, Bich-Liên Doan
The 41st International Conference on Machine Learning (ICML 2024).
-
MCM: Masked Cell Modeling for Anomaly Detection in Tabular Data.
Jiaxin Yin, Yuanyuan Qiao, Zitang Zhou, Xiangchao Wang, Jie Yang
The 12th International Conference on Learning Representations (ICLR 2024). (paper)
-
SemanticMask: A Contrastive View Design for Anomaly Detection in Tabular Data. (paper)
Shuting Tao, Tongtian Zhu, Hongwei Wang, Xiangming Meng
The 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024).
2.3 Transfer Learning
-
Tabular Insights, Visual Impacts: Transferring Expertise from Tables to Images.
Jun-Peng Jiang, Han-Jia Ye, Leye Wang, Yang Yang, Yuan Jiang, De-Chuan Zhan.
The 41st International Conference on Machine Learning (ICML 2024).
-
CARTE: Pretraining and Transfer for Tabular Learning.
Myung Jun Kim, Leo Grinsztajn, Gael Varoquaux
The 41st International Conference on Machine Learning (ICML 2024).
-
TabLog: Test-Time Adaptation for Tabular Data Using Logic Rules.
Weijieying Ren, Xiaoting Li, Huiyuan Chen, Vineeth Rakesh, Zhuoyi Wang, Mahashweta Das, Vasant Honavar
The 41st International Conference on Machine Learning (ICML 2024).
-
Towards Cross-Table Masked Pretraining for Web Data Mining.
Chao Ye, Guoshan Lu, Haobo Wang, Liyao Li, Sai Wu, Gang Chen, Junbo Zhao
(WWW 2024). (paper)
-
MCM: Masked Cell Modeling for Anomaly Detection in Tabular Data. (paper)
Jiaxin Yin, Yuanyuan Qiao, Zitang Zhou, Xiangchao Wang, Jie Yang
The 12th International Conference on Learning Representations (ICLR 2024).
- XTab: Cross-table Pretraining for Tabular Transformers. (paper)
Bingzhao Zhu, Xingjian Shi, Nick Erickson, Mu Li, George Karypis, Mahsa Shoaran
The 40th International Conference on Machine Learning (ICML 2023)
-
TransTab: Learning Transferable Tabular Transformers Across Tables. (paper)
Zifeng Wang, Jimeng Sun.
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
- External Knowledge Infusion for Tabular Pre-training Models with Dual-adapters. (paper)
Can Qin, Sungchul Kim, Handong Zhao, Tong Yu, Ryan A. Rossi, Yun Fu.
The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2022)
- Transfer Learning on Heterogeneous Feature Spaces for Treatment Effects Estimation.
Ioana Bica, Mihaela van der Schaar
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022) (paper)
2.4 Explanation/Model Assessment
- Interpretable Deep Clustering for Tabular Data.
Jonathan Svirsky, Ofir Lindenbaum
The 41st International Conference on Machine Learning (ICML 2024)
- InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation.
Jacob Si, Wendy Yusi Cheng, Michael Cooper, Rahul G. Krishnan
The 41st International Conference on Machine Learning (ICML 2024)
- Do Machine Learning Models Learn Statistical Rules Inferred from Data? (paper)
Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong
The 40th International Conference on Machine Learning (ICML 2023)
-
An Inductive Bias for Tabular Deep Learning. (paper)
Ege Beyazit, Jonathan Kozaczuk, Bo Li, Vanessa Wallace, Bilal H Fadlallah
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
-
TabCBM: Concept-based Interpretable Neural Networks for Tabular Data. (paper)
Mateo Espinosa Zarlenga, Zohreh Shams, Michael Edward Nelson, Been Kim, Mateja Jamnik
(TMLR 2023)
- Tabular Data: Deep Learning is Not All You Need. (paper)
Ravid Shwartz-Ziv, Amitai Armon.
(Information Fusion 2022)
- Revisiting Deep Learning Models for Tabular Data. (paper)
Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, Artem Babenko
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
2.5 Retrieval
- TabR: Tabular Deep Learning Meets Nearest Neighbors. (paper)
Yury Gorishniy, Ivan Rubachev, Nikolay Kartashev, Daniil Shlenskii, Akim Kotelnikov, Artem Babenko
The 12th International Conference on Learning Representations (ICLR 2024)
- Dense Representation Learning and Retrieval for Tabular Data Prediction. (paper)
Lei Zheng, Ning Li, Xianyu Chen, Quan Gan, Weinan Zhang
The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2023)
2.6 Efficiency
- TinyLUT: Tiny Look-Up Table for Efficient Image Restoration at the Edge. (paper)
Huanan Li, Juntao Guan, Lai Rui, Sijun Ma, Lin Gu
The 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
- HyperFast: Instant Classification for Tabular Data. (paper)
David Bonet, Daniel Mas Montserrat, Xavier Giró-i-Nieto, Alexander Ioannidis
The 38th AAAI Conference on Artificial Intelligence (AAAI 2024)
- TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second. (paper)
Noah Hollmann, Samuel Müller, Katharina Eggensperger, Frank Hutter
The 11th International Conference on Learning Representations (ICLR 2023)
-
Clustering the Sketch: Dynamic Compression for Embedding Tables. (paper)
Henry Tsang, Thomas Dybdahl Ahle
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
- Compressing Tabular Data via Latent Variable Estimation. (paper)
Andrea Montanari, Eric Weiner
The 40th International Conference on Machine Learning (ICML 2023)
Branch 3: Application
3.1 Clinical Tabular Data
- EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records.
Yeonsu Kwon, Jiho Kim, Gyubok Lee, Seongsu Bae, Daeun Kyung, Wonchul Cha, Tom Pollard, Alistair Johnson, Edward Choi.
The 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
- Contrastive Learning for Clinical Outcome Prediction with Partial Data Sources.
Xia, Jonathan Wilson, Benjamin Goldstein, Ricardo Henao.
The 41st International Conference on Machine Learning (ICML 2024)
- Safe Exploration in Dose Finding Clinical Trials with Heterogeneous Participants.
Isabel Chien, Wessel Bruinsma, Javier Gonzalez, Richard E Turner.
The 41st International Conference on Machine Learning (ICML 2024)
- Weight Predictor Network with Feature Selection for Small Sample Tabular Biomedical Data. (paper)
Andrei Margeloiu, Nikola Simidjievski, Pietro Liò, Mateja Jamnik.
The 37th AAAI Conference on Artificial Intelligence (AAAI 2023)
- Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling. (paper)
Xinlu Zhang, Shiyang Li, Zhiyu Chen, Xifeng Yan, Linda Ruth Petzold
The 40th International Conference on Machine Learning (ICML 2023)
-
Towards Semi-Structured Automatic ICD Coding via Tree-based Contrastive Learning. (paper)
Chang Lu, Chandan K. Reddy, Ping Wang, Yue Ning
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
-
MOTOR: A Time-to-Event Foundation Model For Structured Medical Records. (paper)
Ethan Steinberg, Jason Alan Fries, Yizhe Xu, Nigam Shah
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
- Locally Sparse Neural Networks for Tabular Biomedical Data. (paper)
Junchen Yang, Ofir Lindenbaum, Yuval Kluger
The 39th International Conference on Machine Learning (ICML 2022)
- Evaluating Latent Space Robustness and Uncertainty of EEG-ML Models under Realistic Distribution Shifts. (paper)
Neeraj Wagh, Jionghao Wei, Samarth Rawal, Brent M. Berry, Yogatheesan Varatharajah
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
- Debiased, Longitudinal and Coordinated Drug Recommendation through Multi-Visit Clinic Records. (paper)
Hongda Sun, Shufang Xie, Shuqi Li, Yuhan Chen, Ji-Rong Wen, Rui Yan
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
- Retrieve, Reason, and Refine: Generating Accurate and Faithful Patient Instructions. (paper)
Fenglin Liu, Bang Yang, Chenyu You, Xian Wu, Shen Ge, Zhangdaihong Liu, Xu Sun, Yang Yang, David A. Clifton
The 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
- Self-Supervised Graph Neural Networks for Improved Electroencephalographic Seizure Analysis. (paper)
Siyi Tang, Jared Dunnmon, Khaled Kamal Saab, Xuan Zhang, Qianying Huang, Florian Dubost, Daniel Rubin, Christopher Lee-Messer
The 10th International Conference on Learning Representations (ICLR 2022)
- SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes. (paper)
Zhaozhi Qian, Yao Zhang, Ioana Bica, Angela Mary Wood, Mihaela van der Schaar
The 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
3.2 Financial Tabular Data
- Beyond Pure Text: Summarizing Financial Reports Based on Both Textual and Tabular Data. (paper)
Ziao Wang, Zelin Jiang, Xiaofeng Zhang, Jaehyeon Soon, Jialu Zhang, Wang Xiaoyao, Hongwei Du
The 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023)
Existing Surveys
-
Explainable Artificial Intelligence for Tabular Data: A Survey
Sahakyan M, Aung Z, Rahwan T
IEEE Access 2021
-
The use of generative adversarial networks to alleviate class imbalance in tabular data: a survey
Sauber-Cole, Rick, and Taghi M. Khoshgoftaar
Journal of Big Data 2022
-
Deep Neural Networks and Tabular Data: A Survey. (paper)
Vadim Borisov, Tobias Leemann, Kathrin Seßler, Johannes Haug, Martin Pawelczyk, Gjergji Kasneci
IEEE Transactions on Neural Networks and Learning Systems 2022
-
Synthetic data generation for tabular health records: A systematic review
Hernandez, M., Epelde, G., Alberdi, A., Cilla, R., & Rankin, D.
Neurocomputing 2022
-
Embeddings for Tabular Data: A Survey. (paper)
Rajat Singh, Srikanta Bedathur
arXiv 2023
-
Transformers for Tabular Data Representation: A Survey of Models and Applications. (paper)
Gilbert Badaro, Mohammed Saeed, Paolo Papotti
TACL 2023
-
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data. (paper)
Wei-Yao Wang, Wei-Wei Du, Derek Xu, Wei Wang, Wen-Chih Peng
arXiv 2024
-
Language Modeling on Tabular Data: A Survey of Foundations, Techniques and Evolution
Yucheng Ruan, Xiang Lan, Jingying Ma, Yizhi Dong, Kai He, Mengling Feng
arXiv 2024
-
Large language models (LLMs) on tabular data: Prediction, generation, and understanding - a survey
Xi Fang, Weijie Xu, Fiona Anting Tan, Jiani Zhang, Ziqing Hu, Yanjun (Jane) Qi, Scott Nickleach, Diego Socolinsky, Srinivasan "SHS" Sengamedu, Christos Faloutsos
arXiv 2024
Tools & Libraries
- PyTorch Frame: A modular deep learning framework for building neural network models on heterogeneous tabular data. (paper)
- PyTorch Tabular: A Framework for Deep Learning with Tabular Data.
Last updated on March 05, 2024.
(For problems, contact wjr5337@psu.edu. To add papers, please open a pull request in our repo.)