The Canonical pre-mRNA 3'-end Processing Project
Updated Jan. 2023
Most eukaryotic messenger RNA precursors (pre-mRNAs) must undergo extensive maturational
processing, including 5'-end capping, splicing, and 3'-end cleavage and polyadenylation. The
addition of a poly(A) tail is important for mRNA stability, and enhances mRNA transport to
the cytoplasm and mRNA translation.
The 3'-end processing events include cleavage at a specific site in the pre-mRNA followed by
the addition of the poly(A) tail. In mammals, the cleavage site is defined by an upstream
AAUAAA motif and a downstream U- or G/U-rich element.
A large number of protein factors have been identified that are crucial for pre-mRNA 3'-end
processing. These proteins form several sub-complexes, such as the cleavage and polyadenylation
specificity factor (CPSF) and the cleavage stimulation factor (CstF). CPSF contains 5 subunits,
CPSF-30, -73, -100, -160, and Fip1, and CstF contains 3 subunits, CstF-50, -64, and -77. CPSF-160
recognizes the upstream AAUAAA motif, and CstF-64 recognizes the downstream U- or G/U-rich element.
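The two sequence elements described above can be illustrated with a minimal Python sketch that scans an RNA sequence for an AAUAAA hexamer followed by a G/U-rich stretch. The function name, the search length, the window size and the G/U threshold are all hypothetical choices made for illustration; they are not values reported by this project, and the sketch is not a validated poly(A) site predictor.

```python
def find_pas_candidates(rna, search_len=60, window=10, min_gu_fraction=0.75):
    """Return (hexamer_start, element_start, gu_fraction) tuples.

    For each AAUAAA hexamer, scan the next `search_len` nucleotides for the
    first `window`-nt stretch whose G+U fraction is at least `min_gu_fraction`.
    The three numeric defaults are illustrative assumptions only.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    start = rna.find("AAUAAA")
    while start != -1:
        region_start = start + 6  # first nucleotide after the hexamer
        region_end = min(len(rna), region_start + search_len)
        for i in range(region_start, region_end - window + 1):
            stretch = rna[i:i + window]
            gu = sum(base in "GU" for base in stretch) / window
            if gu >= min_gu_fraction:
                hits.append((start, i, gu))
                break  # keep only the first qualifying stretch per hexamer
        start = rna.find("AAUAAA", start + 1)
    return hits


# Toy sequence (made up for this example, not a real pre-mRNA).
toy = "CCC" + "AAUAAA" + "C" * 15 + "CA" + "UGUGUUUGUU" + "CCC"
for hexamer_start, element_start, gu in find_pas_candidates(toy):
    print(hexamer_start, element_start, round(gu, 2))
```

A hexamer with no qualifying downstream stretch is simply not reported, which mirrors the idea that both elements together define the site.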
Major findings from this project
- The crystal structures of human CPSF-73, and its weak sequence homolog yeast Ydh1p
(CPSF-100), have been determined.
- The structures of CPSF-73 and CPSF-100 contain an N-terminal metallo-β-lactamase domain
and a novel β-CASP domain. A segment of 60 residues after the β-CASP domain contributes to the
N-terminal metallo-β-lactamase domain.
- CPSF-73 binds two zinc ions, each in an octahedral coordination. A sulfate ion in the structure
is a good mimic of the phosphate group of the pre-mRNA substrate.
- The active site in CPSF-73 is located deep in the interface between the metallo-β-lactamase
domain and the β-CASP domain.
- Despite having a similar overall structure, the zinc ligands are absent in CPSF-100. This
subunit cannot bind zinc and is inactive.
- RNA cleavage assays show that CPSF-73 possesses predominantly non-specific endoribonuclease
activity.
- The structural and biochemical studies provide direct experimental evidence that CPSF-73
is the nuclease for the cleavage reaction of pre-mRNA 3'-end processing.
- The crystal structures of the HAT domain of murine CstF-77 as well as its HAT-C subdomain
have been determined.
- The HAT domain contains two subdomains, a HAT-N domain with 5 HAT repeats and a HAT-C domain
with 7 repeats.
- The HAT domain structure is a highly extended dimer, spanning about 165 Å. The dimer interface
is extensive, and the residues in the interface are mostly conserved.
- Analytical ultracentrifugation and yeast two-hybrid studies confirm that the HAT domain can
dimerize in solution.
- The structural, biochemical and biophysical studies suggest CstF-77 may dimerize during
its function in pre-mRNA 3'-end processing.
- The structure of the yeast Rna14-Rna15 complex shows a conserved dimeric association of the HAT
domain in Rna14.
- The structure of the human symplekin N-terminal domain in complex with Ssu72 and a
Pol II CTD phosphopeptide has been determined at 2.4 Å resolution.
- Symplekin N-terminal domain contains seven pairs of anti-parallel helices, with
a backbone fold similar to HEAT/Arm repeats.
- Ssu72 is bound to the concave face of symplekin.
- The CTD phosphopeptide is bound with the pSer5-Pro6 peptide bond in the cis configuration,
indicating that Ssu72 can only dephosphorylate the cis configuration of this bond, in
contrast to the current hypothesis. Ssu72 is the first phosphatase known to have a specificity
for the cis configuration.
- The active site of Ssu72 is located 25 Å away from the interface with symplekin. However,
symplekin can stimulate the phosphatase activity of Ssu72. Therefore, symplekin may be more
than just a passive scaffold, but instead may actively regulate the catalysis by Ssu72.
- The N-terminal domain of symplekin inhibits transcription-coupled pre-mRNA 3'-end
processing. Ssu72 can block this inhibition, demonstrating for the first time a role for
mammalian Ssu72 in pre-mRNA 3'-end processing.
- An active site mutant of Ssu72 cannot block this inhibition, suggesting that
the phosphatase activity is required for 3'-end processing.
- Ssu72 recognizes the pSer7 CTD in the opposite orientation compared to the pSer5 phosphopeptide.
- Ssu72 has much weaker phosphatase activity toward pSer7 compared to pSer5, based on
phosphate release assays.
- Rtr1 is a novel zinc finger protein, but its structure lacks an active site and it does not
have pSer5 phosphatase activity.
- The C-terminal domains (CTDs) of IntS9 and IntS11 have extensive interactions, involving
highly conserved residues.
- Mutations in the IntS9-IntS11 interface can block their interactions and Integrator function.
- The proper interaction of IntS9 and IntS11 is important for Integrator function.
- The structure of mammalian polyadenylation specificity factor (mPSF) in complex with the
AAUAAA PAS RNA has been determined by cryo-EM.
- CPSF30 and WDR33 directly contact the PAS, and CPSF160 functions as a scaffold to position
CPSF30 and WDR33 for binding the RNA.
- The CPSF160-WDR33 complex has structural homology to the DDB1-DDB2 complex for DNA damage repair.
- There is an extensive interface between CPSF160 and WDR33, as well as between CPSF160 and CPSF30.
- ZF2 and ZF3 of CPSF30 each recognize two nucleotides of the PAS (A1, A2 and A4, A5,
respectively). They share a conserved mode of recognizing A-A dinucleotides.
- There is a Hoogsteen base pair between U3 and A6 of the PAS RNA.
- mPSF has nanomolar affinity for the PAS RNA. Variations in the PAS RNA, as well as mutations
in CPSF30, greatly reduce the binding affinity.
- Three sequence motifs in Sen1 have been identified that interact with the CID of Nrd1.
This interaction does not depend on any phosphorylation, in contrast to the interaction
between the CID and the Pol II CTD.
- mCF (CPSF73, CPSF100 and symplekin) assumes a trilobal structure, but the three lobes are
highly dynamic relative to each other.
- A peptide segment, named the mPSF Interaction Motif (PIM), in the generally highly hydrophilic
and poorly conserved insert of the CPSF100 β-CASP domain is bound by CPSF160 and WDR33, and
tethers mCF to mPSF.
- Mutations or deletion of the PIM can abolish the formation of the mPSF-mCF complex (CPSF).
- The position of mCF relative to mPSF is highly dynamic in the inactive conformation.
- CstF77 is bound by CPSF160-WDR33, using a different surface area.
- CPSF160 is the central core of the machinery. It recruits WDR33 and CPSF30 to recognize
the AAUAAA PAS. CPSF160-WDR33 recruits mCF and CstF.
- The crystal structure of human CPSF30 ZF4-ZF5 in complex with hFip1 segment 159-200 has been
determined at 1.9 Å resolution.
- The complex has 1:2 stoichiometry, with one hFip1 bound to each of ZF4 and ZF5.
- The binding modes of hFip1 to the two ZFs are essentially identical.
- Mutations of ZF residues in the interface can block binding to each ZF.
- The binding affinity is in the low nM range.
- Each hFip1 bound to CPSF30 can recruit one molecule of the catalytic module of PAP.
- Each hFip1 binding site in CPSF30 can support AAUAAA-dependent polyadenylation.
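As a rough illustration of what a 1:2 stoichiometry with low nM affinity implies, the following Python sketch computes site occupancy from a simple equilibrium binding model. It assumes two independent, equivalent hFip1 sites on CPSF30 and free hFip1 in excess; both assumptions, the function names and the example Kd of 5 nM are hypothetical and are not measurements from this project.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Occupancy of a single site for simple 1:1 equilibrium binding.

    Assumes the free ligand concentration is approximately the total.
    """
    return ligand_nM / (kd_nM + ligand_nM)


def mean_hfip1_per_cpsf30(hfip1_nM, kd_nM, n_sites=2):
    """Average number of hFip1 bound per CPSF30.

    Assumes `n_sites` independent sites with the same Kd (an assumption
    made for this sketch, motivated by the similar binding modes).
    """
    return n_sites * fraction_bound(hfip1_nM, kd_nM)


EXAMPLE_KD_NM = 5.0  # hypothetical value chosen only to be "low nM"

for conc in (1.0, 5.0, 50.0, 500.0):
    bound = mean_hfip1_per_cpsf30(conc, EXAMPLE_KD_NM)
    print(f"{conc:7.1f} nM hFip1 -> {bound:.2f} hFip1 per CPSF30")
```

Under these assumptions the mean occupancy rises toward two hFip1 per CPSF30 as the hFip1 concentration exceeds the Kd, consistent with the 1:2 complex seen in the crystal.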
Publications from this project
Mandel CR, Kaneko S, Zhang H, Gebauer D, Vethantham V, Manley JL, Tong L. (2006).
Polyadenylation factor CPSF-73 is the pre-mRNA 3'-end-processing endonuclease.
Nature, 444, 953-956.
Bai Y, Auperin TC, Chou C-Y, Chang G-G, Manley JL, Tong L. (2007).
Crystal structure of murine CstF-77: dimeric association and implications for
polyadenylation of mRNA precursors.
Mol. Cell. 25, 863-875.
Mandel CR, Gebauer D, Zhang H, Tong L. (2006).
A serendipitous discovery that in situ proteolysis is essential for
the crystallization of yeast CPSF-100 (Ydh1p).
Acta Cryst. F62, 1041-1045.
Bai Y, Auperin TC, Tong L. (2007).
The use of in situ proteolysis in the crystallization of murine CstF-77.
Acta Cryst. F63, 135-138.
Mandel CR, Tong L. (2007).
How to get all "A"s in polyadenylation.
Structure. 15, 1024-1026.
Mandel CR, Bai Y, Tong L. (2008).
Protein factors in pre-mRNA 3'-end processing.
Cell. Mol. Life Sci. 65, 1099-1122.
K. Xiang, T. Nagaike,* S. Xiang,* T. Kilic, M.M. Beh, J.L. Manley
& L. Tong. (2010). Crystal structure of the human symplekin-Ssu72-CTD
phosphopeptide complex.
Nature. 467, 729-733.
Y. Bai, S.K. Srivastava, J.H. Chang, J.L. Manley & L. Tong. (2011).
Structural basis for dimerization and activity of human PAPD1,
a noncanonical poly(A) polymerase.
Mol. Cell. 41, 311-320.
J.H. Chang & L. Tong. (2012). Mitochondrial poly(A) polymerase
and polyadenylation. Biochim. Biophys. Acta, 1819, 992-997.
A.R. Paulson & L. Tong. (2012). Crystal structure of the Rna14-Rna15
complex. RNA, 18, 1154-1162.
K. Xiang, J.L. Manley & L. Tong. (2012). The yeast regulator of
transcription protein Rtr1 lacks an active site and phosphatase
activity. Nature Commun. 3, 946. doi: 10.1038/ncomms1947.
K. Xiang, J.L. Manley & L. Tong. (2012). An unexpected binding
mode for a Pol II CTD peptide phosphorylated at Ser7 in the active
site of the CTD phosphatase Ssu72. Genes Develop. 26, 2265-2270.
W.C. Wilson, H.-T. Hornig-Do, F. Bruni, J.H. Chang, A.A. Jourdain,
J.-C. Martinou, M. Falkenberg, H. Spahr, N.-G. Larsson, R.J. Lewis,
L. Hewitt, A. Basle, H.E. Cross, L. Tong, R.R. Lebel, A.H. Crosby,
Z.M.A. Chrzanowska-Lightowlers* & R.N. Lightowlers.* (2014).
A human mitochondrial poly(A) polymerase mutation reveals the
complexities of post-transcriptional mitochondrial gene expression.
Human Mol. Genet. 23, 6345-6355. (*-co-corresponding authors)
A.R. Jurado, D. Tan, X. Jiao, M. Kiledjian & L. Tong. (2014).
Structure and function of pre-mRNA 5'-end capping quality control
and 3'-end processing.
Biochem. 53, 1882-1898.
K. Xiang, L. Tong & J.L. Manley. (2014).
Delineating the structural blueprint of the pre-mRNA 3'-end processing machinery.
Mol. Cell. Biol. 34, 1894-1910.
Y. Wu,* T.R. Albrecht,* D. Baillat, E.J. Wagner$ & L. Tong.$ (2017).
Molecular basis for the interaction between Integrator subunits
IntS9 and IntS11 and its functional importance.
Proc. Natl. Acad. Sci. USA, 114, 4394-4399.
(*-equal first authors, $-co-corresponding authors)
Y. Sun,* Y. Zhang,* K. Hamilton, J.L. Manley,$ Y. Shi, T. Walz$
& L. Tong.$ (2018).
Molecular basis for the recognition of the human AAUAAA polyadenylation signal.
Proc. Natl. Acad. Sci. USA, 115, E1419-E1428.
(*-equal first authors, $-co-corresponding authors)
T.R. Albrecht, S.P. Shevtsov, Y. Wu, L.G. Mascibroda, N.J. Peart,
K.-L. Huang, I.A. Sawyer, L. Tong, M. Dundr$ & E.J. Wagner.$ (2018).
Integrator subunit 4 is a 'symplekin-like' scaffold that associates
with INTS9/11 to form the Integrator cleavage module.
Nucl. Acids Res. 46, 4241-4255.
Y. Zhang, Y. Chun, S. Buratowski & L. Tong. (2019).
Identification of three sequence motifs in the transcription
termination factor Sen1 that mediate direct interactions with Nrd1.
Structure, 27, 1156-1161.
K. Hamilton, Y. Sun & L. Tong. (2019).
Biophysical characterizations of the recognition of the AAUAAA polyadenylation signal.
RNA, 25, 1673-1680.
Y. Zhang,* Y. Sun,* Y. Shi, T. Walz$ & L. Tong.$ (2020). Structural
insights into the human pre-mRNA 3'-end processing machinery.
Mol. Cell, 77, 800-809. (Epub 12/3/19) (*-equal first authors,
$-co-corresponding authors)
K. Hamilton & L. Tong. (2020). Molecular mechanism for the
interaction between human CPSF30 and hFip1. Genes Develop. 34, 1753-1761.
Y. Sun, K. Hamilton & L. Tong. (2020). Recent molecular insights
into canonical pre-mRNA 3'-end processing. Transcription, 11, 83-96.
P.A. Gutierrez, J. Wei, Y. Sun & L. Tong. (2022). Molecular basis
for the recognition of the AUUAAA polyadenylation signal by mPSF. RNA, 28, 1534-1541.
Funding for this project
NIH R01GM077175 (2007-2016)
NIH R35GM118093 (2016-)
© copyright 2006-2023, Liang Tong.