11.
AminoWeb: 29 cleaned protein datasets totaling 7.5 TB (x.com)
Protein ML now has a FineWeb-like bundle of cleaned datasets covering sequence, structure, and related modalities, instead of scattered supplementary tables and FTP mirrors.
1 appearance on the backlist front page in the last 30 days.