Abstract
Intrinsically disordered proteins (IDPs) can adopt a range of conformations, from globules to swollen coils. This wide range of conformational preferences among IDPs raises the question of how those preferences are encoded by sequence. Global compositional features of a sequence, such as the fraction of charged residues and the net charge per residue, engender certain conformational biases. However, more specific sequence features, such as the patterning of oppositely charged residues, of expansion-driving residues, or of residues that can undergo posttranslational modifications, can also influence the conformational ensemble of an IDP. Here, we outline how to calculate important global compositional features and patterning metrics that can be used to classify IDPs into different conformational classes and to predict relative changes in conformation for sequences with the same amino acid composition. Although increased effort has been devoted to determining the conformational properties of IDPs in recent years, quantitative predictions of conformation directly from sequence remain difficult and often inaccurate. Thus, if quantitative predictions of conformational properties are desired, sequence-specific simulations must be performed.
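As a concrete illustration of the two global compositional features named above, the following minimal sketch computes the fraction of charged residues (FCR) and the net charge per residue (NCPR) from a one-letter amino acid sequence. It rests on assumptions that are not stated in the abstract: Lys and Arg are counted as positive, Asp and Glu as negative, His is treated as neutral, posttranslational modifications are ignored, and NCPR is reported with its sign. The function name `charge_features` is hypothetical and is not taken from any published tool.

```python
def charge_features(sequence, positive="KR", negative="DE"):
    """Return FCR and NCPR for a one-letter amino acid sequence.

    Assumptions (illustrative only): K/R carry +1, D/E carry -1,
    all other residues (including H) are neutral, no modifications.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")

    n_pos = sum(seq.count(res) for res in positive)  # number of positive residues
    n_neg = sum(seq.count(res) for res in negative)  # number of negative residues
    n = len(seq)

    return {
        "FCR": (n_pos + n_neg) / n,   # fraction of charged residues
        "NCPR": (n_pos - n_neg) / n,  # signed net charge per residue
    }


if __name__ == "__main__":
    # Toy sequence, not a real protein
    print(charge_features("KKDEAA"))
```

Two sequences with identical FCR and NCPR can still differ in how their charges are arranged along the chain, which is why composition alone is complemented by patterning metrics.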