Development build for ashkan-pirmani/fl-kit@79a62ab (branch: dev-0.1)

Federated Learning Glossary

Terminology in Federated Learning can be complex and context-specific. This glossary gives concise definitions of key concepts and technical terms so that readers across disciplines use them consistently.

Term Description Category
Active Learning Model selects most informative data points for labeling. General ML
Adverse Event Unintended medical occurrence during treatment or study. Clinical/Healthcare
Algorithmic Fairness Ensuring machine learning models avoid biased or discriminatory outcomes. General ML
Alignment Arranging DNA, RNA, or protein sequences against each other or a reference to identify regions of similarity. Bioinformatics
Allele Frequency Proportion of a specific allele among all alleles in a population. Bioinformatics
Anonymization Removing or masking personal identifiers from data. Data & Privacy
Asynchronous Federated Learning FL where the server applies client updates as they arrive instead of waiting for a synchronized round. Federated Learning
AutoFE in Federated Learning Automated feature engineering adapted for federated learning settings. Federated Learning
AutoML Automated process of model selection, training, and tuning. General ML
Batch Effect Systematic differences between data batches, often in biomedical data. Bioinformatics
Bias Mitigation Techniques to reduce bias in machine learning models. General ML
Bioinformatics Application of computational tools to biological data. Bioinformatics
Biomarker Biological molecule indicating a process, condition, or disease. Clinical/Healthcare
Blockchain in FL Using blockchain for secure, transparent model updates in FL. Federated Learning
Bootstrapping Resampling technique for estimating statistics or model performance. Analytics
BYOD (Bring Your Own Data) Participants contribute their own data to collaborative analysis. Data & Privacy
Byzantine-Robust Aggregation Aggregation methods resilient to malicious or faulty clients in FL. Federated Learning
Casemix Measuring clinical activity based on patient characteristics for reimbursement. Clinical/Healthcare
Centralized Learning Model training with all data collected in one location. General ML
ChIP-Seq Technique to analyze protein interactions with DNA. Bioinformatics
Client Clustering Grouping clients with similar data distributions in FL to enhance performance. Federated Learning
Clinical Decision Support System (CDSS) System providing clinicians with knowledge to enhance patient care decisions. Clinical/Healthcare
Clinical Trial Research study to evaluate medical, surgical, or behavioral interventions. Clinical/Healthcare
Cohort Study Observational study following a group over time. Clinical/Healthcare
Common Data Model (CDM) Standardized structure for organizing data to facilitate sharing and analysis. Data & Privacy
Communication-Efficient Algorithms FL algorithms designed to minimize communication overhead. Federated Learning
Consent Management Handling patient permissions for data use and sharing. Data & Privacy
Continuous Learning Model updates as new data arrives, without retraining from scratch. General ML
Cross-Silo Federated Learning FL among organizations (e.g., hospitals) with large datasets. Federated Learning
Cross-Validation Splitting data into folds to assess model performance. Analytics
Data Acquisition Gathering data from various sources for analysis. Data & Privacy
Data Anonymization Removing identifiable information from datasets to protect privacy. Data & Privacy
Data Augmentation Creating new data samples by modifying existing ones. General ML
Data Cleaning Correcting or removing erroneous data to improve quality. Data & Privacy
Data Dictionary Descriptive list of data elements in a system or database. Data & Privacy
Data Drift Change in data distribution over time, affecting model performance. Analytics
Data Federation Sharing data from distributed sources without centralization. Data & Privacy
Data Governance Managing data availability, usability, integrity, and security. Data & Privacy
Data Harmonization Standardizing data from multiple sources to a common format. Data & Privacy
Data Imputation Filling in missing data values using statistical methods. Analytics
Data Integration Combining data from different sources into a unified view. Data & Privacy
Data Lake Centralized repository storing raw data in its native format, whether structured, semi-structured, or unstructured. Data & Privacy
Data Leakage Information from outside the training dataset unintentionally influencing a model, inflating its measured performance. Data & Privacy
Data Lineage Tracking data origin and transformations throughout its lifecycle. Data & Privacy
Data Minimization Limiting data collection to only what is necessary. Data & Privacy
Data Preprocessing Preparing data for analysis through normalization and encoding. Analytics
Data Provenance Documentation of data origins and processing history. Data & Privacy
Data Quality Measure of data’s accuracy, completeness, and reliability. Data & Privacy
Data Stewardship Overseeing data assets to ensure quality and compliance. Data & Privacy
Data Use Agreement (DUA) Contract governing data sharing and usage between parties. Data & Privacy
Data Wrangling Cleaning and transforming raw data into a usable format. Analytics
Deep Learning Machine learning using neural networks with multiple layers. General ML
De-identification Removing or obscuring personal identifiers from data. Data & Privacy
Descriptive Analytics Examining data to understand past events and trends. Analytics
Differential Expression Identifying genes expressed differently between conditions. Bioinformatics
Differential Privacy Mathematical framework bounding how much any one individual's data can change an output, typically by adding calibrated noise. Security/Privacy
Digital Biomarker Digital data indicating health status or disease progression. Clinical/Healthcare
Digital Pathology Analysis of digitized pathology slides using computational methods. Clinical/Healthcare
Digital Twin Virtual representation of a patient or system for simulation. Clinical/Healthcare
Distributed Learning Model training across multiple locations or devices. General ML
DNA Sequencing Determining the order of nucleotides in DNA. Bioinformatics
Edge Computing in FL Performing FL computations on edge devices to reduce latency. Federated Learning
Electronic Health Record (EHR) Digital record of a patient’s medical history. Clinical/Healthcare
Electronic Medical Record (EMR) Digital version of a patient’s paper chart. Clinical/Healthcare
Ensemble Learning Combining multiple models to improve prediction accuracy. General ML
Ethics Board Committee overseeing ethical aspects of research and data use. Clinical/Healthcare
Exploratory Data Analysis (EDA) Summarizing main characteristics of datasets through analysis. Analytics
Explainable AI (XAI) AI systems whose decisions can be understood by humans. General ML
FAIR Principles Guidelines for making data Findable, Accessible, Interoperable, Reusable. Data & Privacy
Federated Analytics Analyzing distributed data without moving or centralizing it. Federated Learning
Federated Averaging (FedAvg) FL algorithm averaging local model parameters for global updates. Federated Learning
Federated Feature Engineering Feature engineering in FL without sharing raw data. Federated Learning
Federated Learning ML training across decentralized clients that share model updates rather than raw data. Federated Learning
Federated One-Shot Analysis Single-round federated analysis without iterative communication. Federated Learning
Federated Query Querying distributed datasets without centralizing data. Federated Learning
FedProx FL algorithm adding a proximal term to each client's local objective to stabilize training on non-IID data. Federated Learning
FHIR Standard for electronic healthcare information exchange. Clinical/Healthcare
Genotype Genetic makeup of an organism. Bioinformatics
Genome-Wide Association Study (GWAS) Study associating genetic variants with traits or diseases. Bioinformatics
Generalization Model’s ability to perform well on unseen data. General ML
Gradient Leakage Attack reconstructing training data from shared gradients. Security/Privacy
Health Information Exchange (HIE) Electronic sharing of health-related information among organizations. Clinical/Healthcare
HL7 Standards for transferring clinical and administrative data. Clinical/Healthcare
Homomorphic Encryption Encryption allowing computations on encrypted data without decryption. Security/Privacy
Horizontal Federated Learning FL with same features but different samples across clients. Federated Learning
Horizontally Partitioned Data Data with different rows stored in different locations. Data & Privacy
Hyperparameter Parameter set before training, not learned from data. General ML
ICD-10 International classification system for diseases and health conditions. Clinical/Healthcare
Imbalanced Data Datasets where some classes are underrepresented. Analytics
Informed Consent Patient agreement for data use in research. Clinical/Healthcare
Interoperability Ability of systems to exchange and use information. Data & Privacy
k-anonymity Ensuring each record is indistinguishable from at least k-1 others on its quasi-identifiers. Security/Privacy
Key Performance Indicators (KPIs) Metrics evaluating organizational or activity success. Analytics
Label Noise Incorrect or inconsistent labels in training data. Analytics
Label Propagation Spreading labels from labeled to unlabeled data points. General ML
Latency Delay between input and response in a system. Analytics
Local Differential Privacy Privacy protection applied at the data source before sharing. Security/Privacy
Longitudinal Study Research collecting data from the same subjects over time. Clinical/Healthcare
Machine Learning Enabling computers to learn from data without explicit programming. General ML
Medical Imaging Creating visual representations of the interior of a body. Clinical/Healthcare
Membership Inference Attack determining whether a specific record was used in training. Security/Privacy
Meta-Learning in Federated Learning Meta-learning for fast adaptation of FL global models. Federated Learning
Metabolomics Study of chemical processes involving metabolites. Bioinformatics
Minimum Data Set Smallest set of data elements for a specific purpose. Data & Privacy
mHealth Using mobile devices for medicine and public health. Clinical/Healthcare
Model Compression Reducing model size for efficiency. General ML
Model Deployment Integrating machine learning models into production environments. General ML
Model Drift Model performance degrades due to changing data. Analytics
Model Evaluation Assessing model performance using metrics like accuracy. Analytics
Model Explainability Ability to interpret and understand model predictions. General ML
Model Personalization Adapting FL global models to individual client data. Federated Learning
Model Poisoning Malicious client updates degrading FL global models. Security/Privacy
Model Selection Choosing the best machine learning model for a task. General ML
Model Training Teaching machine learning models using data. General ML
Multi-Omics Integrative analysis of multiple omics data types. Bioinformatics
Multi-Task Learning Training models on multiple related tasks simultaneously. General ML
Neural Network Computational model inspired by the human brain. General ML
Next-Generation Sequencing (NGS) High-throughput DNA sequencing technologies. Bioinformatics
Non-IID Data Data not independently and identically distributed across clients. Federated Learning
OHDSI Community developing standards for observational health data. Clinical/Healthcare
Omics Data Large-scale datasets from genomics, proteomics, etc. Bioinformatics
One-Shot Federated Learning FL that trains a global model in a single communication round. Federated Learning
Ontology Structured vocabulary for a domain, enabling data integration. Bioinformatics
Overfitting Model learns training data too well, performs poorly on new data. General ML
Patient Cohort Group of patients sharing common characteristics. Clinical/Healthcare
Patient Similarity Learning Identifying similar patients for diagnosis or treatment planning. Clinical/Healthcare
Pathology Informatics Application of informatics in pathology for data management and analysis. Clinical/Healthcare
Personal Health Record (PHR) Health record managed and controlled by the patient. Clinical/Healthcare
Personalized Federated Learning (PFL) FL customizing models for each client’s data. Federated Learning
Personally Identifiable Information (PII) Data that can identify an individual. Data & Privacy
Pharmacogenomics Study of how genes affect drug response. Bioinformatics
Phenotype Observable characteristics of an organism. Bioinformatics
Predictive Analytics Predicting future events using data analysis. Analytics
Prescriptive Analytics Recommending actions for optimal outcomes using data. Analytics
Privacy by Design Incorporating privacy into system design from the start. Security/Privacy
Privacy-Preserving Computation Computations that protect private data. Security/Privacy
Proteomics Study of the structure and function of proteins. Bioinformatics
Pseudonymization Replacing direct identifiers with pseudonyms so that re-identification requires separately held information. Security/Privacy
Quality Assurance (QA) Ensuring data and processes meet defined quality standards. Analytics
Quality Control (QC) Operational techniques to fulfill quality requirements. Analytics
Real-World Data (RWD) Data collected from routine clinical practice. Clinical/Healthcare
Real-World Evidence (RWE) Clinical evidence from real-world data analysis. Clinical/Healthcare
Reproducibility Ability to obtain consistent results using the same data and methods. Analytics
Scaffold FL algorithm reducing client drift using control variates. Federated Learning
Secure Aggregation Protocol ensuring only aggregated updates are visible to the server. Security/Privacy
Secure Enclave Hardware-based secure area for sensitive computations. Security/Privacy
Secure Multi-Party Computation Cryptographic protocol for private multi-party computations. Security/Privacy
Semi-Supervised Learning ML using both labeled and unlabeled data. General ML
SHAP Values Method for explaining individual model predictions. Analytics
Single-Cell Analysis Study of gene expression at the single-cell level. Bioinformatics
SNOMED CT Standardized clinical terminology for electronic health records. Clinical/Healthcare
Synthetic Data Artificially generated data resembling real data. Data & Privacy
Supervised Learning ML using labeled data to train models. General ML
Swarm Learning Decentralized ML using blockchain for coordination. Federated Learning
Telemedicine Remote diagnosis and treatment via telecommunications. Clinical/Healthcare
Test Data Dataset for evaluating trained model performance. Analytics
Tokenization Converting sensitive data into non-sensitive tokens. Security/Privacy
Training Data Dataset used to train machine learning models. Analytics
Transfer Learning Reusing a pre-trained model for a new task. General ML
Transcriptomics Study of RNA transcripts produced by the genome. Bioinformatics
Trusted Execution Environment (TEE) Secure area of a processor for sensitive computations. Security/Privacy
Underfitting Model too simple to capture data patterns. General ML
Unsupervised Learning ML finding patterns in unlabeled data. General ML
Validation Data Held-out dataset for tuning hyperparameters and comparing models during development. Analytics
Variant Calling Identifying genetic variants from sequence data. Bioinformatics
Vertical Federated Learning FL with different features for the same samples across clients. Federated Learning
Vertically Partitioned Data Data with different columns stored in different locations. Data & Privacy
Zero-Knowledge Proof Proving knowledge of information without revealing it. Security/Privacy
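
As a concrete reading of the Federated Averaging (FedAvg) entry above, the sketch below shows one aggregation step in plain Python. It assumes each client reports a flat list of parameters together with its local sample count, and that the server weights each client by its share of the total samples, which is the usual FedAvg formulation. The names `federated_average` and `ClientUpdate` are hypothetical and do not come from any particular FL library.

```python
from typing import List, Sequence, Tuple

# Hypothetical type: (local model parameters, number of local training samples)
ClientUpdate = Tuple[Sequence[float], int]


def federated_average(updates: List[ClientUpdate]) -> List[float]:
    """Return the sample-size-weighted mean of the clients' parameter vectors."""
    if not updates:
        raise ValueError("need at least one client update")

    total_samples = sum(n for _, n in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(updates[0][0])
    global_params = [0.0] * dim

    for params, n in updates:
        if len(params) != dim:
            raise ValueError("all clients must report the same number of parameters")
        weight = n / total_samples  # clients with more data contribute more
        for i, value in enumerate(params):
            global_params[i] += weight * value

    return global_params


# Two illustrative clients: one with 10 samples, one with 30.
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
print(federated_average(updates))  # [2.5, 3.5]
```

Only the parameter vectors and sample counts reach the server in this sketch; the raw training data stays with each client. A real deployment would typically add the protections listed under Secure Aggregation or Differential Privacy.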
