Deep Learning for Electronic Health Records: Foundations, Challenges, Advances and Future Directions

Builders & Current Maintainers: Weijieying Ren, YuQing Huang, Jingxi Zhu, Zehao Liu, Tianxiang Zhao, and Prof. Vasant Honavar.

Paper List

We have summarized the main branches of work on deep tabular data representation learning, including its downstream tasks and applications. For more details, please refer to our recent survey (paper).

Branch 1: Data-Centric Approaches

Branch 2: Neural Modeling Strategies

2.1 Feature-Aware Modules

2.1.1 Discretization and Binning-Based Methods

2.1.2 Kernel-Based Methods

2.2 Model Architecture Design

2.2.1 Tree-based

2.2.2 Graph-based

2.2.3 Rule-based Models

2.2.4 Additive-model-based

2.2.5 Hierarchical and Structured Temporal Models

2.3 Temporal Dependency Modeling

2.3.1 Irregular/Asynchronous Sampling

2.3.2 Multi-Timescale Dynamics

2.3.3 Conditional Clinical Sequences

2.4 Meta-Architectural Strategies

2.4.1 Meta-Adaptive Modeling

2.4.2 Neural Architecture Search (NAS)

Branch 3: Learning-Focused Approaches

3.1 Self-Supervised Learning

3.2 Clustering-Based Methods

3.3 Latent Representation Learning

3.4 Causal Representation Learning

3.5 Continual Learning

Branch 4: Learning with External Modalities and Knowledge

4.1 Multi-modal Learning

4.1.1 Cross-modal Alignment

4.1.1.1 Global Alignment

4.1.1.2 Fine-grained Alignment

4.1.1.2.1 Fine-Grained Contextual Understanding
4.1.1.2.2 Extending 2D to 3D Imaging
4.1.1.2.3 Region-Level Medical LMMs

4.1.1.3 Data-Efficient Parallel and Unpaired Alignment

4.1.1.3.1 Parallel Data Collection
4.1.1.3.2 Learning from Unpaired Data

4.1.2 Knowledge-Informed Modeling

4.1.3 Temporal and Asynchronous Integration and Modeling

4.1.4 Modality-Specific Robustness

4.2 Multi-source Learning

  • Communication Efficient Federated Generalized Tensor Factorization for Collaborative Health Data Analytics.
    Jing Ma, Qiuchen Zhang, Jian Lou, Li Xiong, Joyce C. Ho
    Proceedings of the Web Conference 2021, pages 171–182.

4.3 Learning with Knowledge Graph

4.4 Learning with External Data Source

4.4.1 External Knowledge via Prompting and Instruction Tuning

4.4.2 Internalized Knowledge from LLMs

4.4.3 Case-Based Knowledge from Patient Records

Branch 5: LLM-Based Modeling and Systems

5.1 Learning with LLMs

5.1.1 Prompt-Based Methods

5.1.2 Pretraining Methods

5.1.3 Fine-Tuning Methods

5.1.4 Retrieval-Augmented Methods

5.2 LLM-Driven Medical Agents

1.2.6 Masking Modeling

1.3 Reinforcement Learning

1.4 Temporal Modeling

1.7 Clinical Agent

Others: Benchmark

1.3 Learning with External Knowledge

1.3.1 Learning with Good Model Initialization

1.3.1 Learning with Knowledge Graph

1.3.2 Learning with Large Language Models

1.4 Causal Representation Learning

Branch 2: Downstream Tasks

2.1 Generation

2.1.1 GAN-based Models

2.1.2 VAE-based Models

2.1.3 Diffusion-based Models

2.1.4 Transformer-based

2.1.5 Large Language Model-based

2.1.6 Model-agnostic

2.2 Anomaly Detection

2.3 Transfer Learning

2.4 Explanation/Model Assessment

2.5 Retrieval

2.6 Efficiency

Branch 3: Application

3.1 Clinical Tabular Data

3.2 Financial Tabular Data

Existing Surveys

Tools & Libraries

Last updated on March 05, 2024. (For problems, contact wjr5337@psu.edu. To add papers, please open a pull request on our repo.)