Table 7 Studies highlighting strategies for reducing Ab immunogenicity based on big data assessments

From: The rise of big data: deep sequencing-driven computational methods are transforming the landscape of synthetic antibody design

Study

Findings

Limitations

Clavero-Álvarez et al. [77]

The authors presented a computational method based on multivariate Gaussian analysis that characterizes the statistical distribution of the variable sequences from human Abs. The analysis was performed using large human and murine learning databases and led to the humanization of various murine sequences

▪ The strategy depends on the size and quality of the learning databases

▪ Only developed for humanizing murine Abs

▪ Uses stringent threshold scores that reduce the number of potential humanized sequences

Schmitz et al. [78]

A large immunome dataset of 326 million human Ab sequences was used to create a position- and gene-specific scoring matrix. This strategy was used to analyze the human Ab sequence space, allowing a given input sequence to be compared against associated human Ab features

▪ The score is calculated exclusively from V and J gene templated regions

▪ The untemplated CDR-H3 region is not included in the score calculation

▪ Scoring success depends on the chain type and on the CDR-H3 length

Prihoda et al. [80]

A deep-learning methodology was able to predict the most probable human residues given a particular input sequence. This was done by performing exhaustive comparative iterations using NGS datasets of human-derived Ab sequences; this helped determine the prevalence of human-like features from a given input sequence

▪ The model is trained to recognize masked or mutated residues and to repair them based on their sequence context

▪ To compare across humanization methods, only average performance results across multiple sequences were used

Wollacott et al. [81]

An LSTM network was capable of learning the specific native features within Ab sequences. To assess the humanness of a given Ab sequence, the model was trained using extensive deep sequencing datasets from naturally occurring Ab repertoires. Ultimately, the model successfully predicted humanness for random Ab sequences

▪ The model performance is related to the underlying sequence space used in training

▪ The LSTM model favors sequences that are more germline-like

▪ The LSTM model treats rare sequences as outliers

Marks et al. [82]

A predictive model uses machine learning classifiers trained on deep sequencing big data. It is then used to derive specific mutations for an input sequence that help reduce its immunogenicity potential. The predicted mutations show substantial overlap with those deduced experimentally, indicating that the methodology is an effective replacement for trial-and-error humanization experiments

▪ Prediction performance decreases when classifying sequences from species the model has not been trained on

▪ The model is intended for use on murine precursor sequences

▪ The model is not applicable to the humanization of alternative Ab formats (e.g. nanobodies and asymmetric Abs)
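
The position-specific scoring idea summarized for Schmitz et al. [78] can be illustrated with a toy sketch. This is not the authors' implementation: the function names `build_position_frequencies` and `score_sequence`, the pseudocount, the mean log-frequency score, and the five-residue example sequences are all assumptions made for illustration. It also ignores gene-specific matrices and assumes the sequences are already aligned to equal length and contain only the 20 standard residues.

```python
from collections import Counter
from math import log

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_position_frequencies(aligned_seqs, pseudocount=1.0):
    """Return one {residue: frequency} dict per aligned position.

    Hypothetical helper: a pseudocount keeps unseen residues from
    getting a frequency of exactly zero.
    """
    if not aligned_seqs:
        raise ValueError("need at least one reference sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("reference sequences must have equal length")

    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    freqs = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_seqs)
        freqs.append(
            {aa: (counts.get(aa, 0) + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return freqs


def score_sequence(seq, freqs):
    """Mean log-frequency of the query residues (higher = more reference-like)."""
    if len(seq) != len(freqs):
        raise ValueError("query must match the reference alignment length")
    total = 0.0
    for pos, aa in enumerate(seq):
        if aa not in freqs[pos]:
            raise ValueError(f"unsupported residue {aa!r} at position {pos}")
        total += log(freqs[pos][aa])
    return total / len(seq)


# Toy reference set standing in for a large human repertoire (made-up data).
reference = ["QVQLV", "EVQLV", "QVQLV", "QVQLL"]
freqs = build_position_frequencies(reference)

common_score = score_sequence("QVQLV", freqs)
unusual_score = score_sequence("WWWWW", freqs)
```

In this toy setup, a query that matches the most frequent residue at each position receives a higher score than one built from residues never seen in the reference set. A real scoring matrix would be built from a far larger, gene-annotated dataset and, as the limitations above note, would exclude the untemplated CDR-H3 region.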