The Challenges

Challenge 1:

In Silico Antibody Affinity Maturation

Given the NGS datasets of the sorting outputs of an affinity maturation campaign with diversity in LCDR1-2, LCDR3, and HCDR1-2, as described by Teixeira et al., 2022, design antibody CDRs with improved affinity for the RBD of SARS-CoV-2 that also exhibit favorable developability properties. Only the CDRs may be changed; do not mutate the frameworks.
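
The framework constraint can be checked mechanically before submitting a design. The following is a minimal sketch, not part of the challenge materials: the function names and the toy sequences are hypothetical, and it assumes each chain comes with its own CDR strings (as in the dataset columns) so that CDRs can be located by substring search.

```python
def framework_segments(chain: str, cdrs: list[str]) -> list[str]:
    """Return the stretches of `chain` that lie outside the given CDRs.

    CDRs are located by left-to-right substring search, which assumes each
    CDR string occurs in order and is not matched earlier by chance.
    """
    segments, cursor = [], 0
    for cdr in cdrs:
        start = chain.find(cdr, cursor)
        if start < 0:
            raise ValueError(f"CDR {cdr!r} not found in chain")
        segments.append(chain[cursor:start])
        cursor = start + len(cdr)
    segments.append(chain[cursor:])
    return segments


def frameworks_unchanged(parent: str, parent_cdrs: list[str],
                         design: str, design_cdrs: list[str]) -> bool:
    """True if parent and design differ only inside their CDRs."""
    return (framework_segments(parent, parent_cdrs)
            == framework_segments(design, design_cdrs))


# Toy placeholder fragments (NOT challenge data): FR1 + CDR1 + FR2 + CDR2 + FR3.
PARENT_CDRS = ["GFTFSSYA", "ISGSGGST"]
PARENT = "QVQLVES" + PARENT_CDRS[0] + "MSWVRQA" + PARENT_CDRS[1] + "YYADSVK"

GOOD_CDRS = ["GFTFDDYA", "ISGSGGST"]          # only CDR1 changed
GOOD = "QVQLVES" + GOOD_CDRS[0] + "MSWVRQA" + GOOD_CDRS[1] + "YYADSVK"

BAD_CDRS = ["GFTFSSYA", "ISGSGGST"]           # CDRs kept, framework 2 mutated
BAD = "QVQLVES" + BAD_CDRS[0] + "MSWIRQA" + BAD_CDRS[1] + "YYADSVK"

good_ok = frameworks_unchanged(PARENT, PARENT_CDRS, GOOD, GOOD_CDRS)
bad_ok = frameworks_unchanged(PARENT, PARENT_CDRS, BAD, BAD_CDRS)
```

Comparing framework segments rather than whole chains lets a design change CDR length without being flagged, while any framework substitution is caught.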

Register to participate

Example of dataset format (example ONLY; not actual data)

field / column	description [TYPE]	# or range of non-redundant values:
sort_population	population of the single domain library into which diversity was inserted [STRING]	phase1_h1_h2_am: 5,959 unique VL+VH phase1_l1_l2_am: 23,320 unique VL+VH phase1_l3_am: 3,589 unique VL+VH
sequence_aa_heavy	amino acid [STRING]	phase1_h1_h2_am: 31,729 phase1_l1_l2_am: 1 phase1_l3_am: 1
sequence_aa_light	amino acid [STRING]	phase1_h1_h2_am: 1 phase1_l1_l2_am: 122,988 phase1_l3_am: 30,379
cdr1_aa_heavy [IMGT]	amino acid [STRING]	phase1_h1_h2_am: 2,528 phase1_l1_l2_am: 1 phase1_l3_am: 1
cdr2_aa_heavy [IMGT]	amino acid [STRING]	phase1_h1_h2_am: 1,023 phase1_l1_l2_am: 1 phase1_l3_am: 1
cdr3_aa_heavy [IMGT]	amino acid [STRING]	phase1_h1_h2_am: 1 phase1_l1_l2_am: 1 phase1_l3_am: 1
cdr1_aa_light [IMGT]	amino acid [STRING]	phase1_h1_h2_am: 1 phase1_l1_l2_am: 4,770 phase1_l3_am: 826
cdr2_aa_light [KABAT]	amino acid [STRING]	phase1_h1_h2_am: 1 phase1_l1_l2_am: 7,047 phase1_l3_am: 784
cdr3_aa_light [IMGT]	amino acid [STRING]	phase1_h1_h2_am: 1 phase1_l1_l2_am: 996 phase1_l3_am: 5,585
redundancy	# of times full chain appears in NGS at the amino acid level [INTEGER]	phase1_h1_h2_am: [1 – 12,099] [min – max] phase1_l1_l2_am: [1 – 494] [min – max] phase1_l3_am: 1,541 [1 – 968] [min – max]
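
As a rough illustration of how the per-population counts in the table could be reproduced from a file in this format, the sketch below tallies unique heavy chains, light chains, and VL+VH pairs for each `sort_population`, along with the `redundancy` range. It assumes a comma-delimited file whose header uses the field names shown above; the rows in `SAMPLE_CSV` are made-up placeholders, not challenge data, and `summarize` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows only; real sequences and counts come from the challenge files.
SAMPLE_CSV = """sort_population,sequence_aa_heavy,sequence_aa_light,redundancy
phase1_h1_h2_am,HEAVYAAA,LIGHTAAA,12
phase1_h1_h2_am,HEAVYAAC,LIGHTAAA,3
phase1_l3_am,HEAVYAAA,LIGHTAAD,7
phase1_l3_am,HEAVYAAA,LIGHTAAE,1
"""


def summarize(handle):
    """Count non-redundant chains per sort population and track redundancy range."""
    heavy = defaultdict(set)
    light = defaultdict(set)
    pairs = defaultdict(set)
    counts = defaultdict(list)
    for row in csv.DictReader(handle):
        pop = row["sort_population"]
        heavy[pop].add(row["sequence_aa_heavy"])
        light[pop].add(row["sequence_aa_light"])
        pairs[pop].add((row["sequence_aa_heavy"], row["sequence_aa_light"]))
        counts[pop].append(int(row["redundancy"]))
    return {
        pop: {
            "unique_heavy": len(heavy[pop]),
            "unique_light": len(light[pop]),
            "unique_pairs": len(pairs[pop]),
            "redundancy_range": (min(counts[pop]), max(counts[pop])),
        }
        for pop in pairs
    }


summary = summarize(io.StringIO(SAMPLE_CSV))
```

For a real file, `summarize(open(path, newline=""))` would take the place of the in-memory sample.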

Click here for an example dataset (.csv) for competition 1

Challenge 2:

In Silico Affinity Rank Prediction for Antibody Discovery

Given the NGS datasets of an HCDR3 clustered selection output, generated from the library described in Teixeira et al., 2021, identify those sequences within the existing NGS dataset that encode the highest affinity antibodies (that have not already had their affinities determined) in three HCDR3 clusters (27F, 28F and 47F).
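
To make the task concrete, here is a minimal baseline sketch. It keeps only uncharacterized sequences from the three target clusters and orders them by a naive enrichment score: the ratio of relative abundance in the 1nM sort to that in the 10nM sort. This heuristic is an assumption chosen for illustration, not a method endorsed by the organizers, and the `pseudocount`, the function name, and the toy rows are hypothetical. Field names follow the example dataset format for this challenge.

```python
TARGET_CLUSTERS = {"27F", "28F", "47F"}


def rank_candidates(rows, pseudocount=1e-3):
    """Rank uncharacterized sequences per target cluster by 1nM/10nM enrichment."""
    ranked = {cluster: [] for cluster in sorted(TARGET_CLUSTERS)}
    for row in rows:
        if row["characterized"].strip().upper() == "TRUE":
            continue  # affinity already measured; not a prediction target
        cluster = row["cluster_cdr3_heavy"]
        if cluster not in TARGET_CLUSTERS:
            continue
        at_10nm = float(row["relative_abundance_10nM"])
        at_1nm = float(row["relative_abundance_1nM"])
        # Pseudocount avoids division by zero for sequences absent at 10nM.
        score = (at_1nm + pseudocount) / (at_10nm + pseudocount)
        ranked[cluster].append(
            (score, row["sequence_aa_heavy"], row["sequence_aa_light"])
        )
    for cluster in ranked:
        ranked[cluster].sort(key=lambda item: item[0], reverse=True)
    return ranked


# Toy rows with placeholder sequence labels (NOT challenge data).
rows = [
    {"characterized": "FALSE", "cluster_cdr3_heavy": "27F",
     "sequence_aa_heavy": "H_A", "sequence_aa_light": "L_A",
     "relative_abundance_10nM": "0.5", "relative_abundance_1nM": "2.0"},
    {"characterized": "FALSE", "cluster_cdr3_heavy": "27F",
     "sequence_aa_heavy": "H_B", "sequence_aa_light": "L_B",
     "relative_abundance_10nM": "1.0", "relative_abundance_1nM": "0.5"},
    {"characterized": "TRUE", "cluster_cdr3_heavy": "27F",
     "sequence_aa_heavy": "H_C", "sequence_aa_light": "L_C",
     "relative_abundance_10nM": "0.1", "relative_abundance_1nM": "3.0"},
    {"characterized": "FALSE", "cluster_cdr3_heavy": "57F",
     "sequence_aa_heavy": "H_D", "sequence_aa_light": "L_D",
     "relative_abundance_10nM": "0.1", "relative_abundance_1nM": "3.0"},
    {"characterized": "FALSE", "cluster_cdr3_heavy": "47F",
     "sequence_aa_heavy": "H_E", "sequence_aa_light": "L_E",
     "relative_abundance_10nM": "0.2", "relative_abundance_1nM": "0.2"},
]

ranked = rank_candidates(rows)
```

A learned model trained on the characterized rows would replace the enrichment score; the filtering and per-cluster ranking steps would stay the same.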

Example of dataset format (example ONLY; not actual data) – This is for challenge 2 AND challenge 3

field / column	description [TYPE]	# or range of non-redundant values:
characterized	TRUE if characterized by SPR [BOOL]	TRUE: #VL+VH: 142; #L3+H3: 74; #H3: 55; #H3 clusters: 40. FALSE: #VL+VH: 31,917; #L3+H3: 4,362; #H3: 754; #H3 clusters: 300
lsa_bin	experimentally determined bin group [INTEGER]	values = [1, 2, 5, NA]
cluster_cdr3_heavy	unique identifier (e.g., 57F) [STRING]	300
affinity	LSA affinity in molarity (M) [FLOAT]	2.7×10^-11 – 2.1×10^-3 M; NA = uncharacterized; 1.0×10^-6 M = characterized, weak affinity
on_rate	LSA on-rate (M^-1 s^-1) [FLOAT]	1.2×10^2 – 4.5×10^5 M^-1 s^-1; NA = uncharacterized OR characterized, slow on-rate
off_rate	LSA off-rate (s^-1) [FLOAT]	1.0×10^-5 – 0.6×10^-2 s^-1; NA = uncharacterized OR characterized, fast off-rate
sequence_aa_light	amino acid [STRING]	102aa to 120aa
sequence_aa_heavy	amino acid [STRING]	113aa to 133aa
cdr1_aa_heavy [IMGT]	amino acid [STRING]	7aa to 14aa
cdr2_aa_heavy [IMGT]	amino acid [STRING]	6aa to 9aa
cdr3_aa_heavy [IMGT]	amino acid [STRING]	6aa to 26aa
cdr1_aa_light [IMGT]	amino acid [STRING]	5aa to 12aa
cdr2_aa_light [KABAT]	amino acid [STRING]	6aa to 11aa
cdr3_aa_light [IMGT]	amino acid [STRING]	5aa to 18aa
relative_abundance_10nM	relative abundance of the concatenated CDRs in the 10nM RBD sort round via NGS [FLOAT]	0.0% – 6.1%
relative_abundance_1nM	relative abundance of the concatenated CDRs in the 1nM RBD sort round via NGS [FLOAT]	0.0% – 9.5%
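
The kinetic fields in this table are related: under a simple 1:1 binding model, the equilibrium dissociation constant is the off-rate divided by the on-rate. The sketch below parses the `NA` convention used in the table and recomputes an affinity from the two rates. The 1:1 model is an assumption here (the source does not say how the reported `affinity` was fitted), so a reported value need not match the ratio exactly; the helper names are hypothetical.

```python
import math


def parse_optional(value: str):
    """Convert a field to float, mapping empty strings and 'NA' to None."""
    text = value.strip()
    if text == "" or text.upper() == "NA":
        return None
    return float(text)


def kd_from_rates(on_rate, off_rate):
    """K_D = k_off / k_on in molar units, or None if either rate is missing."""
    if on_rate is None or off_rate is None or on_rate <= 0:
        return None
    return off_rate / on_rate


# Illustrative values inside the ranges listed in the table (not real measurements).
kd = kd_from_rates(parse_optional("1.0e5"), parse_optional("1.0e-3"))
missing = kd_from_rates(parse_optional("NA"), parse_optional("1.0e-3"))
```

Because `NA` in the rate columns can mean either "uncharacterized" or "characterized but outside the measurable range", the `characterized` column is needed to tell the two cases apart.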

Click here for an example dataset (.csv) for competition 2 & 3

Challenge 3:

NGS-Inspired Computational Antibody Design

Given the NGS datasets of the same HCDR3 clustered selection output as challenge 2, generate out-of-library antibody sequences, changing only CDRs and leaving frameworks untouched, that bind the same target with the highest affinities, and also exhibit favorable developability properties.
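
One way to screen generated designs for the "out-of-library" requirement is to drop any design already present in the NGS dataset. The sketch below is illustrative only: it assumes "out-of-library" means the full VL+VH pair does not appear in the dataset, which the challenge text does not define precisely, and `novel_designs` and the toy rows are hypothetical.

```python
CHAIN_FIELDS = ("sequence_aa_heavy", "sequence_aa_light")


def novel_designs(designs, library, fields=CHAIN_FIELDS):
    """Keep designs whose values for `fields` never occur together in the library."""
    seen = {tuple(row[name] for name in fields) for row in library}
    return [d for d in designs if tuple(d[name] for name in fields) not in seen]


# Toy rows with placeholder sequence labels (NOT challenge data).
library = [
    {"sequence_aa_heavy": "H_A", "sequence_aa_light": "L_A"},
    {"sequence_aa_heavy": "H_B", "sequence_aa_light": "L_B"},
]
designs = [
    {"sequence_aa_heavy": "H_A", "sequence_aa_light": "L_A"},  # already in library
    {"sequence_aa_heavy": "H_A", "sequence_aa_light": "L_B"},  # new pairing
    {"sequence_aa_heavy": "H_X", "sequence_aa_light": "L_A"},  # new heavy chain
]

kept = novel_designs(designs, library)
```

Passing a different `fields` tuple (for example, the six CDR columns) gives a stricter notion of novelty in which a design must differ from every library member in its concatenated CDRs.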
