CN116761614A - Engineered uridine phosphorylase variant enzymes - Google Patents

Publication number
CN116761614A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180084994.8A
Other languages
Chinese (zh)
Inventor
乔纳森·弗罗姆
杰西卡·安娜·胡尔塔克
安德斯·马修·奈特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Codexis Inc
Original Assignee
Codexis Inc
Application filed by Codexis Inc
Priority claimed from PCT/US2021/064161 (WO2022133289A2)
Publication of CN116761614A
Legal status: Pending

Abstract

The invention provides engineered uridine phosphorylase (UP) enzymes, polypeptides having UP activity, and polynucleotides encoding these enzymes, as well as vectors and host cells comprising these polynucleotides and polypeptides. Methods for producing UP enzymes are also provided. The invention also provides compositions comprising UP enzymes, and methods of using engineered UP enzymes. The invention is particularly useful in the production of pharmaceutical compounds.

Description

Engineered uridine phosphorylase variant enzymes

This application claims priority to U.S. Provisional Patent Application Serial No. 63/127,431, filed December 18, 2020, and U.S. Provisional Patent Application Serial No. 63/148,324, filed February 11, 2021, both of which are incorporated by reference in their entirety for all purposes.

Field of the Invention

The present invention provides engineered uridine phosphorylase (UP) enzymes, polypeptides having UP activity, and polynucleotides encoding these enzymes, as well as vectors and host cells comprising these polynucleotides and polypeptides. Methods for producing UP enzymes are also provided. The present invention also provides compositions comprising UP enzymes, and methods of using engineered UP enzymes. The present invention is particularly useful for the production of pharmaceutical compounds.

Reference to a Sequence Listing, Table, or Computer Program

The official copy of the Sequence Listing is submitted concurrently with the specification as an ASCII-formatted text file via EFS-Web, with the file name "CX2-214WO2_ST25.txt", a creation date of December 14, 2021, and a size of 1.84 megabytes. The Sequence Listing filed via EFS-Web is part of the specification and is incorporated herein by reference in its entirety.

Background of the Invention

A growing number of non-natural nucleoside analogs are being investigated for the treatment of cancer and viral infections, such as COVID-19. Because of their similarity to the natural nucleosides used in DNA synthesis, nucleoside analogs are often effective inhibitors of viral diseases. Similarly, nucleoside analogs are known to stimulate anti-tumor innate immune responses, which are activated by the presence of foreign nucleic acids.

However, producing nucleoside analogs by standard chemical synthesis techniques can be challenging because of their chemical complexity. In addition, standard chemical synthesis techniques often generate undesirable waste. The substrates and intermediates required to produce nucleoside analogs may be unavailable or unsuitable for industrial process conditions.

There is a need for improved methods of producing non-natural nucleoside analogs used to treat cancer and viral infections. In particular, improved methods are needed for synthesizing, under industrial process conditions, the substrates and intermediates required to produce nucleoside analogs. One approach is to use engineered polypeptides with improved properties to produce these substrates and intermediates. This biocatalytic approach has the advantages of reducing chemical waste and improving efficiency.

Enzymes belonging to the uridine phosphorylase class (EC 2.4.2.3) typically catalyze the conversion of uridine and phosphate to uracil and α-D-ribose 1-phosphate. However, uridine phosphorylases (UPs) also catalyze the reverse reaction and can be used in a variety of transformations to synthesize non-natural nucleoside analogs. Improved uridine phosphorylase biocatalysts are needed to produce the intermediates and substrates necessary for the efficient production of non-natural nucleoside analogs.
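For orientation, the reversible reaction described in the preceding paragraph can be written out as follows. This is only a restatement of the text in equation form; the symbol P_i for inorganic phosphate is a notational assumption, not taken from the specification.

```latex
\[
\text{uridine} + \mathrm{P_i}
\;\rightleftharpoons\;
\text{uracil} + \alpha\text{-D-ribose 1-phosphate}
\]
```

Read left to right, this is the phosphorolysis typically catalyzed by EC 2.4.2.3 enzymes; read right to left, it is the reverse reaction that the paragraph identifies as useful for nucleoside synthesis.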

Summary of the Invention

The present invention provides engineered uridine phosphorylase (UP) enzymes, polypeptides having UP activity, and polynucleotides encoding these enzymes, as well as vectors and host cells comprising these polynucleotides and polypeptides. Methods for producing UP enzymes are also provided. The present invention also provides compositions comprising UP enzymes, and methods of using engineered UP enzymes. The present invention is particularly useful for the production of pharmaceutical compounds.

The present invention provides an engineered uridine phosphorylase comprising a polypeptide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 246, SEQ ID NO: 594, SEQ ID NO: 776 and/or SEQ ID NO: 868, or a functional fragment thereof, wherein the engineered uridine phosphorylase comprises a polypeptide comprising at least one substitution or substitution set in the polypeptide sequence, and wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 246, SEQ ID NO: 594, SEQ ID NO: 776 and/or SEQ ID NO: 868.

In some embodiments, the polypeptide sequence has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 2, and the polypeptide of the engineered uridine phosphorylase comprises at least one substitution or substitution set at one or more positions in the polypeptide sequence selected from: 6, 7, 9, 14, 14/38/40/146/147/179/235/236, 14/38/86/146/147/235/236/240, 14/40, 14/40/86/147/193/236/240, 14/40/136/179/236/240, 14/40/146/235/236, 14/40/147/181/193/235, 14/40/235, 14/86/146, 14/146/147/181/240, 14/146/236/240, 14/147/193/235/236/240, 14/179/181/193/235/240, 14/235/236, 29, 31, 38/40/86/146/147/179/181, 38/40/86/147/236/240, 40, 40/43/86/146/240, 40/43/86/147/235, 40/43/146/147, 40/43/147/179/236/240, 40/43/147/179/240, 40/43/147/236/240, 40/86/146/235/236, 40/86/147/235/236/240, 40/86/179/235/240, 40/86/235/236/240, 40/146/147/240, 40/147/240, 40/235, 40/235/236/240, 40/236, 40/236/240, 42/235/236, 43/86/147/181/240, 43/146/147/235/236/240, 43/146/179/240, 43/147, 43/147/179/181, 47, 47/88, 64, 73, 80, 86, 86/136/146/147/179/181, 86/136/146/147/179/235/236, 86/147/179/181, 86/235, 86/235/236/240, 86/236/240, 86/240, 92, 97, 99, 103, 103/249, 104, 105, 106, 110, 146/147, 146/147/235/236, 146/235/240, 146/236/240, 146/240, 147/179/181, 147/235/240/249, 157, 167, 179, 179/181, 179/181/193/240, 181, 184, 216, 226, 228, 231, 233, 235, 235/236, 236, 236/240, 237, 239, 240 and 245, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 2.

In some embodiments, the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 2, and the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: 6F, 6G, 6R, 6S, 6W, 6Y, 7T, 9M, 14A, 14A/38L/40S/146E/147L/179R/235R/236A, 14A/38L/86V/146E/147L/235R/236A/240R, 14A/40D/86V/147M/193L/236A/240R, 14A/40D/235R, 14A/40N, 14A/40N/146E/235R/236A, 14A/40N/147L/181Q/193L/235R, 14A/40S, 14A/40S/136V/179R/236A/240R, 14A/86V/146E, 14A/146E/147L/181Q/240R, 14A/146E/236A/240R, 14A/147L/193L/235R/236A/240R, 14A/179R/181Q/193L/235R/240R, 14A/235R/236A, 29S, 31T, 38L/40D/86V/146E/147L/179R/181Q, 38L/40S/86V/147L/236A/240R, 40D, 40D/86V/147L/235R/236A/240R, 40D/86V/235R/236A/240R, 40D/235R, 40N/43F/86V/146E/240R, 40N/43F/86V/147L/235R, 40N/43F/146E/147L, 40N/43F/147L/179R/236A/240R, 40N/86V/179R/235R/240R, 40S/43F/147L/179R/240R, 40S/43F/147L/236A/240R, 40S/86V/146E/235R/236A, 40S/86V/147L/235R/236A/240R, 40S/146E/147L/240R, 40S/147L/240R, 40S/235R/236A/240R, 40S/236A, 40S/236A/240R, 42I/235R/236A, 43F/86V/147L/181Q/240R, 43F/146E/147L/235R/236A/240R, 43F/146E/179R/240R, 43F/147L, 43F/147L/179R/181Q, 47M, 47R, 47V, 47V/88A, 64C, 73G, 80G, 80I, 80K, 80M, 80S, 80T, 86V, 86V/136V/146E/147L/179R/181Q, 86V/136V/146E/147L/179R/235R/236A, 86V/147L/179R/181Q, 86V/235R, 86V/235R/236A/240R, 86V/236A/240R, 86V/240R, 92V, 97T, 99E, 103G/249V, 103S, 104T, 105M, 106E, 110A, 110C, 110S, 146E/147L, 146E/147L/235R/236A, 146E/235R/240R, 146E/236A/240R, 146E/240R, 147L/179R/181Q, 147L/235R/240R/249V, 157S, 167S, 179R, 179R/181Q, 179R/181Q/193L/240R, 181Q, 184R, 184S, 216M, 226R, 228A, 228G, 228K, 228L, 228R, 231I, 231P, 231T, 231W, 231Y, 233S, 235R, 235R/236A, 236A, 236A/240R, 237V, 239A, 239K, 239R, 240R and 245S, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 2.

In some embodiments, the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 2, and the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: V6F, V6G, V6R, V6S, V6W, V6Y, F7T, L9M, N14A, N14A/M38L/K40S/S146E/I147L/H179R/K235R/Q236A, N14A/M38L/I86V/S146E/I147L/K235R/Q236A/H240R, N14A/K40D/I86V/I147M/M193L/Q236A/H240R, N14A/K40D/K235R, N14A/K40N, N14A/K40N/S146E/K235R/Q236A, N14A/K40N/I147L/K181Q/M193L/K235R, N14A/K40S, N14A/K40S/C136V/H179R/Q236A/H240R, N14A/I86V/S146E, N14A/S146E/I147L/K181Q/H240R, N14A/S146E/Q236A/H240R, N14A/I147L/M193L/K235R/Q236A/H240R, N14A/H179R/K181Q/M193L/K235R/H240R, N14A/K235R/Q236A, D29S, V31T, M38L/K40D/I86V/S146E/I147L/H179R/K181Q, M38L/K40S/I86V/I147L/Q236A/H240R, K40D, K40D/I86V/I147L/K235R/Q236A/H240R, K40D/I86V/K235R/Q236A/H240R, K40D/K235R, K40N/K43F/I86V/S146E/H240R, K40N/K43F/I86V/I147L/K235R, K40N/K43F/S146E/I147L, K40N/K43F/I147L/H179R/Q236A/H240R, K40N/I86V/H179R/K235R/H240R, K40S/K43F/I147L/H179R/H240R, K40S/K43F/I147L/Q236A/H240R, K40S/I86V/S146E/K235R/Q236A, K40S/I86V/I147L/K235R/Q236A/H240R, K40S/S146E/I147L/H240R, K40S/I147L/H240R, K40S/K235R/Q236A/H240R, K40S/Q236A, K40S/Q236A/H240R, V42I/K235R/Q236A, K43F/I86V/I147L/K181Q/H240R, K43F/S146E/I147L/K235R/Q236A/H240R, K43F/S146E/H179R/H240R, K43F/I147L, K43F/I147L/H179R/K181Q, H47M, H47R, H47V, H47V/T88A, V64C, S73G, E80G, E80I, E80K, E80M, E80S, E80T, I86V, I86V/C136V/S146E/I147L/H179R/K181Q, I86V/C136V/S146E/I147L/H179R/K235R/Q236A, I86V/I147L/H179R/K181Q, I86V/K235R, I86V/K235R/Q236A/H240R, I86V/Q236A/H240R, I86V/H240R, I92V, A97T, Q99E, N103G/A249V, N103S, V104T, G105M, D106E, T110A, T110C, T110S, S146E/I147L, S146E/I147L/K235R/Q236A, S146E/K235R/H240R, S146E/Q236A/H240R, S146E/H240R, I147L/H179R/K181Q, I147L/K235R/H240R/A249V, A157S, E167S, H179R, H179R/K181Q, H179R/K181Q/M193L/H240R, K181Q, M184R, M184S, V216M, Q226R, I228A, I228G, I228K, I228L, I228R, A231I, A231P, A231T, A231W, A231Y, T233S, K235R, K235R/Q236A, Q236A, Q236A/H240R, T237V, S239A, S239K, S239R, H240R and V245S, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 2.

In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 2. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 2. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 2.
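The substitution sets listed above follow a compact notation: individual substitutions are joined by "/", and each one is written either as position plus variant residue (for example 14A) or as reference residue, position, variant residue (for example N14A). The following is a minimal sketch of how such strings could be parsed; the names `Substitution` and `parse_substitution_set` are hypothetical helpers introduced here for illustration and are not part of the specification. Position-only entries such as "14/40" carry no residue and are deliberately rejected by this sketch.

```python
import re
from typing import List, NamedTuple, Optional


class Substitution(NamedTuple):
    """One amino acid substitution (hypothetical helper type)."""
    reference: Optional[str]  # residue in the reference sequence; None for forms like "14A"
    position: int             # 1-based position in the reference numbering
    variant: str              # residue present in the engineered polypeptide


# Optional reference letter, then digits, then the variant letter.
_TOKEN = re.compile(r"^([A-Z])?(\d+)([A-Z])$")


def parse_substitution_set(text: str) -> List[Substitution]:
    """Split a set such as 'N14A/K40S' into its individual substitutions."""
    substitutions = []
    for token in text.strip().split("/"):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution token: {token!r}")
        reference, position, variant = match.groups()
        substitutions.append(Substitution(reference, int(position), variant))
    return substitutions


if __name__ == "__main__":
    for sub in parse_substitution_set("N14A/K40D/K235R"):
        print(sub)
```

Keeping the reference residue optional lets the same parser handle both the short form (14A/40D/235R) and the long form (N14A/K40D/K235R) of the lists.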

In some embodiments, the present invention provides an engineered uridine phosphorylase having a polypeptide sequence with at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 4, wherein the polypeptide of the engineered uridine phosphorylase comprises at least one substitution or substitution set at one or more positions in the polypeptide sequence selected from: 3, 3/9/216, 3/9/216/236, 3/9/235/237, 3/31/47/179/181/216, 3/31/47/179/181/237, 3/31/47/179/216, 3/31/47/179/216/237, 3/31/179, 3/31/179/181, 3/31/179/181/237, 3/31/179/216, 3/31/181/216, 3/31/181/237, 3/47/179/181/216, 3/47/179/216/237, 3/47/181, 3/179, 3/179/181, 3/179/181/216, 3/179/181/237, 3/179/216, 3/179/216/237, 3/179/237, 3/181/216, 3/181/216/237, 3/216/236/240, 9/216/236/237, 9/237, 13, 24, 31, 31/47, 31/47/179/181/237, 31/47/179/216, 31/47/181, 31/47/216, 31/179, 31/179/181, 31/179/216, 31/181, 31/181/216, 31/181/216/237, 31/181/237, 31/216, 31/216/237, 31/236/237/240, 31/237, 33, 46, 47, 47/147/181/231, 47/179/181, 47/179/181/216, 47/179/184, 47/179/216, 47/181/216, 47/181/216/237, 47/181/231, 47/216, 52, 63, 67, 83, 87/160, 92, 95, 97, 99, 100, 101, 105, 106, 108, 111, 137, 151, 152, 155, 159, 160, 170, 173, 177, 179, 179/181, 179/181/216, 179/181/216/237, 179/181/231, 179/181/241, 179/216, 179/228/231, 179/237, 181, 181/216, 181/216/237, 181/237, 183, 185, 188, 189, 191, 201, 216, 216/237, 218, 222, 228, 231, 233, 235, 235/237, 236, 236/237/240, 237, 238, 240, 241 and 248, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 4.

In some embodiments, the present invention provides an engineered uridine phosphorylase having a polypeptide sequence with at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 4, wherein the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: 3E, 3E/9M/216M, 3E/9M/216M/236A, 3E/9M/235R/237V, 3E/31T/47V/179R/181Q/216M, 3E/31T/47V/179R/181Q/237V, 3E/31T/47V/179R/216M, 3E/31T/47V/179R/216M/237V, 3E/31T/179R, 3E/31T/179R/181Q, 3E/31T/179R/181Q/237V, 3E/31T/179R/216M, 3E/31T/181Q/216M, 3E/31T/181Q/237V, 3E/47V/179R/181Q/216M, 3E/47V/179R/216M/237V, 3E/47V/181Q, 3E/179R, 3E/179R/181Q, 3E/179R/181Q/216M, 3E/179R/181Q/237V, 3E/179R/216M, 3E/179R/216M/237V, 3E/179R/237V, 3E/181Q/216M, 3E/181Q/216M/237V, 3E/216M/236A/240R, 9M/216M/236A/237V, 9M/237V, 13L, 24L, 24M, 24Q, 31T, 31T/47V, 31T/47V/179R/181Q/237V, 31T/47V/179R/216M, 31T/47V/181Q, 31T/47V/216M, 31T/179R, 31T/179R/181Q, 31T/179R/216M, 31T/181Q, 31T/181Q/216M, 31T/181Q/216M/237V, 31T/181Q/237V, 31T/216M, 31T/216M/237V, 31T/236A/237V/240R, 31T/237V, 33L, 33N, 33R, 33V, 33Y, 46Q, 47V, 47V/147L/181Q/231I, 47V/179R/181Q, 47V/179R/181Q/216M, 47V/179R/184R, 47V/179R/216M, 47V/181Q/216M, 47V/181Q/216M/237V, 47V/181Q/231W, 47V/216M, 47W, 52L, 52W, 63M, 67G, 83G, 87H/160C, 92M, 95S, 97S, 99L, 99R, 99V, 100D, 100E, 100G, 100R, 100T, 101A, 101T, 101V, 101W, 105S, 106E, 108A, 108M, 108V, 111K, 111R, 137G, 151S, 152L, 152S, 152V, 155R, 159A, 159T, 160G, 170A, 173R, 177R, 179R, 179R/181Q, 179R/181Q/216M, 179R/181Q/216M/237V, 179R/181Q/231W, 179R/181Q/241V, 179R/216M, 179R/228A/231I, 179R/237V, 181Q, 181Q/216M, 181Q/216M/237V, 181Q/237V, 183L, 183T, 183W, 185A, 185G, 185H, 185K, 185Q, 185R, 185S, 185V, 185W, 188A, 188L, 189L, 189R, 189V, 191F, 191R, 201L, 216M, 216M/237V, 218A, 218V, 222L, 222R, 228H, 228L, 228T, 231G, 231V, 233A, 233G, 233N, 233S, 235H, 235R/237V, 235S, 236A, 236A/237V/240R, 236P, 236S, 236T, 237V, 238G, 238P, 238S, 240G, 241G, 241L, 241M, 241P, 241W and 248T, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 4.

In some embodiments, the present invention provides an engineered uridine phosphorylase having a polypeptide sequence with at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 4, wherein the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: K3E, K3E/L9M/V216M, K3E/L9M/V216M/Q236A, K3E/L9M/K235R/T237V, K3E/V31T/H47V/H179R/K181Q/V216M, K3E/V31T/H47V/H179R/K181Q/T237V, K3E/V31T/H47V/H179R/V216M, K3E/V31T/H47V/H179R/V216M/T237V, K3E/V31T/H179R, K3E/V31T/H179R/K181Q, K3E/V31T/H179R/K181Q/T237V, K3E/V31T/H179R/V216M, K3E/V31T/K181Q/V216M, K3E/V31T/K181Q/T237V, K3E/H47V/H179R/K181Q/V216M, K3E/H47V/H179R/V216M/T237V, K3E/H47V/K181Q, K3E/H179R, K3E/H179R/K181Q, K3E/H179R/K181Q/V216M, K3E/H179R/K181Q/T237V, K3E/H179R/V216M, K3E/H179R/V216M/T237V, K3E/H179R/T237V, K3E/K181Q/V216M, K3E/K181Q/V216M/T237V, K3E/V216M/Q236A/H240R, L9M/V216M/Q236A/T237V, L9M/T237V, K13L, V24L, V24M, V24Q, V31T, V31T/H47V, V31T/H47V/H179R/K181Q/T237V, V31T/H47V/H179R/V216M, V31T/H47V/K181Q, V31T/H47V/V216M, V31T/H179R, V31T/H179R/K181Q, V31T/H179R/V216M, V31T/K181Q, V31T/K181Q/V216M, V31T/K181Q/V216M/T237V, V31T/K181Q/T237V, V31T/V216M, V31T/V216M/T237V, V31T/Q236A/T237V/H240R, V31T/T237V, K33L, K33N, K33R, K33V, K33Y, S46Q, H47V, H47V/I147L/K181Q/A231I, H47V/H179R/K181Q, H47V/H179R/K181Q/V216M, H47V/H179R/M184R, H47V/H179R/V216M, H47V/K181Q/V216M, H47V/K181Q/V216M/T237V, H47V/K181Q/A231W, H47V/V216M, H47W, T52L, T52W, I63M, T67G, Q83G, R87H/D160C, I92M, T95S, A97S, Q99L, Q99R, Q99V, P100D, P100E, P100G, P100R, P100T, H101A, H101T, H101V, H101W, G105S, D106E, L108A, L108M, L108V, T111K, T111R, T137G, T151S, H152L, H152S, H152V, V155R, S159A, S159T, D160G, D170A, S173R, V177R, H179R, H179R/K181Q, H179R/K181Q/V216M, H179R/K181Q/V216M/T237V, H179R/K181Q/A231W, H179R/K181Q/A241V, H179R/V216M, H179R/I228A/A231I, H179R/T237V, K181Q, K181Q/V216M, K181Q/V216M/T237V, K181Q/T237V, S183L, S183T, S183W, E185A, E185G, E185H, E185K, E185Q, E185R, E185S, E185V, E185W, Q188A, Q188L, A189L, A189R, A189V, G191F, G191R, T201L, V216M, V216M/T237V, G218A, G218V, N222L, N222R, I228H, I228L, I228T, A231G, A231V, T233A, T233G, T233N, T233S, K235H, K235R/T237V, K235S, Q236A, Q236A/T237V/H240R, Q236P, Q236S, Q236T, T237V, E238G, E238P, E238S, H240G, A241G, A241L, A241M, A241P, A241W and A248T, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 4.

In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 4. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 4. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 4.
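The paragraphs above combine two ideas: a substitution set applied at positions numbered against a reference sequence, and a minimum percent sequence identity to that reference. The sketch below illustrates both on a toy example. It rests on several assumptions: `apply_substitutions` and `percent_identity` are hypothetical helper names, the 10-residue string is an illustrative stand-in and not SEQ ID NO: 4 (the actual sequences are in the Sequence Listing), and identity is computed as a simple ungapped comparison of equal-length strings, whereas the method by which the specification determines sequence identity is not given in this section.

```python
from typing import Iterable, Tuple


def apply_substitutions(reference: str, subs: Iterable[Tuple[str, int, str]]) -> str:
    """Return a copy of `reference` with each (ref_residue, position, variant) applied.

    Positions are 1-based and numbered against `reference`.
    """
    residues = list(reference)
    for ref_residue, position, variant in subs:
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the reference sequence")
        if reference[position - 1] != ref_residue:
            raise ValueError(
                f"expected {ref_residue} at position {position}, "
                f"found {reference[position - 1]}"
            )
        residues[position - 1] = variant
    return "".join(residues)


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity of two equal-length sequences (simplifying assumption)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical 10-residue toy sequence, NOT taken from the Sequence Listing.
toy_reference = "MSKSDVFHLG"
# A K3E/L9M-style substitution set applied to the toy sequence.
toy_variant = apply_substitutions(toy_reference, [("K", 3, "E"), ("L", 9, "M")])

if __name__ == "__main__":
    print(toy_variant)
    print(percent_identity(toy_reference, toy_variant))
```

Checking the reference residue before substituting catches numbering mistakes early, which matters because every list above is tied to the numbering of one specific SEQ ID NO.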

In some embodiments, the present invention provides an engineered uridine phosphorylase having a polypeptide sequence with at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 246, wherein the polypeptide of the engineered uridine phosphorylase comprises at least one substitution or substitution set at one or more positions in the polypeptide sequence selected from: 3/24/33/47/100/183/185, 3/24/33/47/100/216/228/233, 3/24/33/100/183/185/228, 3/24/33/108, 3/24/47/100, 3/24/47/100/108/111/160/185/233/241, 3/24/47/108/160/241, 3/24/47/160/189, 3/24/47/189/228/233, 3/24/47/228, 3/24/47/228/233, 3/24/95/100, 3/24/95/100/160/189/228/241, 3/24/100, 3/24/100/160/218/241, 3/24/111, 3/24/111/183/228/233/241, 3/24/111/228/233, 3/24/183/185/216, 3/24/189/233, 3/33/47/95/100/241, 3/33/47/100/108/189/216/228/233, 3/33/47/100/111/228, 3/33/47/100/111/233/241, 3/33/47/100/216, 3/33/47/108/111/233, 3/33/47/108/189/233/241, 3/33/160/233, 3/47, 3/47/95/100/108/189/233, 3/47/95/100/111/241, 3/47/95/160/189, 3/47/100/108/183/185/189/241, 3/47/100/160/185, 3/47/100/185/189/228, 3/47/108/111, 3/47/183/189/228/233, 3/47/189, 3/47/228/233, 3/95/100/160/228/233, 3/95/100/183/216/228/233, 3/95/100/183/233, 3/95/185/189/216, 3/95/189, 3/95/233, 3/160, 3/183/185/189/228/233, 3/183/189/228/233, 3/185, 3/185/189, 3/189, 24, 24/33/47, 24/33/47/228/241, 24/33/100/108/241, 24/47/95/100, 24/47/95/100/160/228/233/241, 24/47/185/216/218, 24/47/216, 24/95/183, 24/100/160/233, 24/160/183/185, 24/189/228/233, 33/47/95/100/233, 33/47/95/100/233/241, 33/47/160, 33/47/233, 33/100/183/185, 33/100/185/233, 47, 47/100/111/233, 47/100/189, 47/100/189/233, 47/108/160/228/241, 47/111, 47/160/185/189/233, 47/228/233, 95/100/183, 95/100/189, 95/100/228, 95/100/228/233, 95/100/233, 100/160/185, 100/228/233, 108, 108/183/189/233, 108/185/216/228/233, 160/233, 228 and 228/233, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 246. In some embodiments, the present invention provides an engineered uridine phosphorylase having a polypeptide sequence with at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 246, wherein the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: 3E/24L/33Y/47V/100E/216M/228L/233S, 3E/24L/33Y/47W/100E/183L/185A, 3E/24L/33Y/100E/183L/185A/228L, 3E/24L/33Y/108V, 3E/24L/47V/228L, 3E/24L/47V/228L/233S, 3E/24L/47W/100E, 3E/24L/47W/100E/108V/111K/160G/185A/233S/241A, 3E/24L/47W/108V/160G/241A, 3E/24L/47W/160G/189R, 3E/24L/47W/189R/228L/233S, 3E/24L/95S/100E, 3E/24L/95S/100E/160G/189R/228L/241A, 3E/24L/100E, 3E/24L/100E/160G/218V/241A, 3E/24L/111K, 3E/24L/111K/183L/228L/233S/241A, 3E/24L/111K/228L/233S, 3E/24L/183L/185A/216M, 3E/24L/189R/233S, 3E/33Y/47V/100E/111K/228L, 3E/33Y/47V/100E/111K/233S/241A, 3E/33Y/47V/100E/216M, 3E/33Y/47W/95S/100E/241A, 3E/33Y/47W/100E/108V/189R/216M/228L/233S, 3E/33Y/47W/108V/111K/233S, 3E/33Y/47W/108V/189R/233S/241A, 3E/33Y/160G/233S, 3E/47V/95S/100E/111K/241A, 3E/47V/95S/160G/189R, 3E/47V/100E/160G/185A, 3E/47W, 3E/47W/95S/100E/108V/189R/233S, 3E/47W/100E/108V/183L/185A/189R/241A, 3E/47W/100E/185A/189R/228L, 3E/47W/108V/111K, 3E/47W/183L/189R/228L/233S, 3E/47W/189R, 3E/47W/228L/233S, 3E/95S/100E/160G/228L/233S, 3E/95S/100E/183L/216M/228L/233S, 3E/95S/100E/183L/233S, 3E/95S/185A/189R/216M, 3E/95S/189R, 3E/95S/233S, 3E/160G, 3E/183L/185A/189R/228L/233S, 3E/183L/189R/228L/233S, 3E/185A, 3E/185A/189R, 3E/189R, 24L, 24L/33Y/47W, 24L/33Y/47W/228L/241A, 24L/33Y/100E/108V/241A, 24L/47V/95S/100E, 24L/47V/95S/100E/160G/228L/233S/241A, 24L/47V/185A/216M/218V, 24L/47V/216M, 24L/95S/183L, 24L/100E/160G/233S, 24L/160G/183L/185A, 24L/189R/228L/233S, 33Y/47V/95S/100E/233S, 33Y/47W/95S/100E/233S/241A, 33Y/47W/160G, 33Y/47W/233S, 33Y/100E/183L/185A, 33Y/100E/185A/233S, 47V, 47V/100E/111K/233S, 47V/100E/189R, 47V/108V/160G/228L/241A, 47V/111K, 47V/160G/185A/189R/233S, 47V/228L/233S, 47W/100E/189R/233S, 95S/100E/183L, 95S/100E/189R, 95S/100E/228L, 95S/100E/228L/233S, 95S/100E/233S, 100E/160G/185A, 100E/228L/233S, 108V, 108V/183L/189R/233S, 108V/185A/216M/228L/233S, 160G/233S, 228L and 228L/233S, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 246. In some embodiments, the present invention provides an engineered uridine phosphorylase having a polypeptide sequence with at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 246, wherein the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: K3E/V24L/K33Y/H47V/P100E/V216M/I228L/T233S, K3E/V24L/K33Y/H47W/P100E/S183L/E185A, K3E/V24L/K33Y/P100E/S183L/E185A/I228L, K3E/V24L/K33Y/L108V, K3E/V24L/H47V/I228L, K3E/V24L/H47V/I228L/T233S, K3E/V24L/H47W/P100E, K3E/V24L/H47W/P100E/L108V/T111K/D160G/E185A/T233S/M241A, K3E/V24L/H47W/L108V/D160G/M241A, K3E/V24L/H47W/D160G/A189R, K3E/V24L/H47W/A189R/I228L/T233S, K3E/V24L/T95S/P100E, K3E/V24L/T95S/P100E/D160G/A189R/I228L/M241A, K3E/V24L/P100E, K3E/V24L/P100E/D160G/G218V/M241A, K3E/V24L/T111K, K3E/V24L/T111K/S183L/I228L/T233S/M241A, K3E/V24L/T111K/I228L/T233S, K3E/V24L/S183L/E185A/V216M, K3E/V24L/A189R/T233S, K3E/K33Y/H47V/P100E/T111K/I228L, K3E/K33Y/H47V/P100E/T111K/T233S/M241A, K3E/K33Y/H47V/P100E/V216M, K3E/K33Y/H47W/T95S/P100E/M241A, K3E/K33Y/H47W/P100E/L108V/A189R/V216M/I228L/T233S, K3E/K33Y/H47W/L108V/T111K/T233S, K3E/K33Y/H47W/L108V/A189R/T233S/M241A, K3E/K33Y/D160G/T233S, K3E/H47V/T95S/P100E/T111K/M241A, K3E/H47V/T95S/D160G/A189R, K3E/H47V/P100E/D160G/E185A, K3E/H47W, K3E/H47W/T95S/P100E/L108V/A189R/T233S, K3E/H47W/P100E/L108V/S183L/E185A/A189R/M241A, K3E/H47W/P100E/E185A/A189R/I228L, K3E/H47W/L108V/T111K, K3E/H47W/S183L/A189R/I228L/T233S, K3E/H47W/A189R, K3E/H47W/I228L/T233S, K3E/T95S/P100E/D160G/I228L/T233S, K3E/T95S/P100E/S183L/V216M/I228L/T233S, K3E/T95S/P100E/S183L/T233S, K3E/T95S/E185A/A189R/V216M, K3E/T95S/A189R, K3E/T95S/T233S, K3E/D160G, K3E/S183L/E185A/A189R/I228L/T233S, K3E/S183L/A189R/I228L/T233S, K3E/E185A, K3E/E185A/A189R, K3E/A189R, V24L, V24L/K33Y/H47W, V24L/K33Y/H47W/I228L/M241A, V24L/K33Y/P100E/L108V/M241A, V24L/H47V/T95S/P100E, V24L/H47V/T95S/P100E/D160G/I228L/T233S/M241A, V24L/H47V/E185A/V216M/G218V, V24L/H47V/V216M, V24L/T95S/S183L, V24L/P100E/D160G/T233S, V24L/D160G/S183L/E185A, V24L/A189R/I228L/T233S, K33Y/H47V/T95S/P100E/T233S, K33Y/H47W/T95S/P100E/T233S/M241A, K33Y/H47W/D160G, K33Y/H47W/T233S, K33Y/P100E/S183L/E185A, K33Y/P100E/E185A/T233S, H47V, H47V/P100E/T111K/T233S, H47V/P100E/A189R, H47V/L108V/D160G/I228L/M241A, H47V/T111K, H47V/D160G/E185A/A189R/T233S, H47V/I228L/T233S, H47W/P100E/A189R/T233S, T95S/P100E/S183L, T95S/P100E/A189R, T95S/P100E/I228L, T95S/P100E/I228L/T233S, T95S/P100E/T233S, P100E/D160G/E185A, P100E/I228L/T233S, L108V, L108V/S183L/A189R/T233S, L108V/E185A/V216M/I228L/T233S, D160G/T233S, I228L and I228L/T233S, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 246. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 246. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 246. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 246.
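
The substitution-set notation used in the lists above can be handled programmatically: `K3E/V24L`, for example, denotes lysine (K) at position 3 replaced by glutamate (E) together with valine (V) at position 24 replaced by leucine (L), with positions numbered relative to the reference sequence. The following is a minimal Python sketch for illustration only and is not part of the disclosure. The function names and the five-residue toy reference are hypothetical; a real reference would be the full SEQ ID NO: 246 polypeptide, which is not reproduced here.

```python
import re
from typing import List, NamedTuple, Optional


class Substitution(NamedTuple):
    ref: Optional[str]  # residue in the reference, e.g. "K"; None if omitted ("3E")
    pos: int            # 1-based position, numbered relative to the reference
    new: Optional[str]  # replacement residue, e.g. "E"; None if omitted ("3")


# One token of a substitution set: optional letter, digits, optional letter.
_TOKEN = re.compile(r"([A-Z])?(\d+)([A-Z])?")


def parse_substitution_set(text: str) -> List[Substitution]:
    """Split a set such as 'K3E/V24L' into its individual substitutions."""
    subs = []
    for token in text.split("/"):
        match = _TOKEN.fullmatch(token.strip())
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        ref, pos, new = match.groups()
        subs.append(Substitution(ref, int(pos), new))
    return subs


def apply_substitutions(reference: str, subs: List[Substitution]) -> str:
    """Return the reference sequence with every substitution applied."""
    residues = list(reference)
    for sub in subs:
        if sub.new is None:
            raise ValueError(f"position {sub.pos} names no replacement residue")
        if not 1 <= sub.pos <= len(residues):
            raise ValueError(f"position {sub.pos} lies outside the reference")
        found = residues[sub.pos - 1]
        if sub.ref is not None and found != sub.ref:
            raise ValueError(f"reference has {found} at {sub.pos}, not {sub.ref}")
        residues[sub.pos - 1] = sub.new
    return "".join(residues)


# Toy reference (hypothetical): K at position 3 and V at position 5.
TOY_REFERENCE = "MAKTV"
variant = apply_substitutions(TOY_REFERENCE, parse_substitution_set("K3E/V5A"))
print(variant)
```

Checking the stated reference residue before substituting is a deliberate choice in this sketch: it catches a set that is applied against the wrong reference numbering.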

In some embodiments, the polypeptide sequence has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 594, and the polypeptide of the engineered uridine phosphorylase comprises at least one substitution or substitution set at one or more positions in the polypeptide sequence selected from: 6, 6/9/29/40/100/121/126/179/181/189/237, 6/9/121/179/181, 6/46/52/63/97/121/126/179, 6/52/180/181, 6/63/126/179/242, 9, 9/40/46/97/100/106/135/179/181/207/231, 9/52/126/189/242, 9/97/100/106/126/180/207/231, 9/181/242, 29, 40, 46, 52, 52/63/126, 52/179/189, 61, 63, 97, 100, 106, 121, 126, 126/180/189, 135, 142, 179, 180, 181, 189, 201, 207, 230, 231, 236, 237 and 242, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 594. In some embodiments, the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 594, and the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: 6S, 6S/9M/29A/40N/100G/121G/126M/179R/181Q/189E/237V, 6S/9M/121G/179R/181Q, 6S/46Q/52S/63V/97T/121G/126M/179R, 6S/52S/180L/181Q, 6S/63V/126M/179R/242I, 9M, 9M/40N/46Q/97T/100G/106E/135D/179R/181Q/207S/231E, 9M/52S/126M/189E/242I, 9M/97T/100G/106E/126M/180L/207S/231E, 9M/181Q/242I, 29A, 40N, 46Q, 52S, 52S/63V/126M, 52S/179R/189E, 61A, 63V, 97T, 100G, 106E, 121G, 126M, 126M/180L/189E, 135D, 142A, 179R, 180L, 181Q, 189D, 189E, 201L, 207S, 230D, 231E, 236E, 237V and 242I, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 594. In some embodiments, the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 594, and the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: V6S, V6S/L9M/D29A/K40N/P100G/L121G/L126M/H179R/K181Q/R189E/T237V, V6S/L9M/L121G/H179R/K181Q, V6S/S46Q/T52S/I63V/A97T/L121G/L126M/H179R, V6S/T52S/F180L/K181Q, V6S/I63V/L126M/H179R/V242I, L9M, L9M/K40N/S46Q/A97T/P100G/D106E/E135D/H179R/K181Q/A207S/A231E, L9M/T52S/L126M/R189E/V242I, L9M/A97T/P100G/D106E/L126M/F180L/A207S/A231E, L9M/K181Q/V242I, D29A, K40N, S46Q, T52S, T52S/I63V/L126M, T52S/H179R/R189E, P61A, I63V, A97T, P100G, D106E, L121G, L126M, L126M/F180L/R189E, E135D, E142A, H179R, F180L, K181Q, R189D, R189E, T201L, A207S, N230D, A231E, Q236E, T237V and V242I, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 594. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 594. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 594. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 594.
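
The sequence-identity thresholds recited above (at least 85%, 90%, 95% and so on) compare a candidate polypeptide with a reference sequence. As a rough illustration only, the Python sketch below computes percent identity for two sequences that are assumed to be already aligned and of equal length, with `-` marking gaps, by dividing the number of identical residues by the number of aligned columns. That formula, the function names and the toy sequences are assumptions made for illustration; an actual determination would follow whatever alignment method and identity definition the specification adopts.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a, aligned_b)
        if a == b and a != "-"  # a gap column never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


reference = "MAKTVLSDHI"  # hypothetical 10-residue reference
candidate = "MAETVLSDHI"  # differs by a single substitution (K3E)
identity = percent_identity(reference, candidate)
print(identity, meets_threshold(reference, candidate, 85.0))
```

With these toy sequences, nine of ten columns match, so the candidate clears an 85% or 90% threshold but not a 95% one.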

In some embodiments, the polypeptide sequence has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 776, and the polypeptide of the engineered uridine phosphorylase comprises at least one substitution or substitution set at one or more positions in the polypeptide sequence selected from: 3, 3/8/22/36/181/235/250, 6, 6/45/51/81/126/226/233, 6/45/51/144, 6/45/51/188/189, 6/45/51/189/208, 6/45/51/189/228/236, 6/45/51/208/233, 6/45/149/188/189/208/233/236, 6/51/126/144/208, 6/51/126/189/208/233/236, 6/51/126/189/231/236, 6/51/126/189/233, 6/51/188/189/208/226/228, 6/51/188/189/236, 6/51/189/208/231/233, 6/51/208/226/233, 6/51/208/231, 6/126/188/231/233, 6/144/208, 6/188/189/208/228/233, 8, 8/36/143/147/235, 8/36/181/235, 8/142/147, 8/147/181/250, 8/147/235, 9, 19, 20, 22, 22/147/181/235/250, 24, 36, 36/143/147/181, 36/143/147/235, 40, 41, 43, 45, 45/51, 45/51/126/144/208/226/228, 45/51/144, 45/51/144/208/226/231/233, 45/51/188/189, 45/51/189/233, 45/51/208/226/231, 45/51/208/233, 45/51/226, 45/126/189, 45/126/189/208/226, 45/144/189/228, 45/144/226/231/233, 45/188/189, 45/188/189/208/228, 45/188/189/226/228, 45/188/189/231/233, 45/189, 45/189/208, 46, 51, 51/126/144/208, 51/126/144/226/231/233/236, 51/126/208, 51/144, 51/144/226, 51/188/189/228, 51/189, 51/189/208/226, 51/208, 51/233, 57, 58, 80, 80/135/147, 81, 81/126/144/188/208/228, 86, 103, 126, 126/144/188/189/226, 134, 135, 141, 142, 143, 143/147/235, 144, 144/188/228, 146, 147, 149, 181, 188, 188/189/233, 189, 207, 208, 208/226/233, 208/228, 208/231/233, 226, 228, 230, 231, 232, 233, 235, 236, 240 and 250, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 776. In some embodiments, the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 776, and the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: 3N, 3N/8A/22G/36I/181R/235G/250A, 6T, 6T/45G/51I/81W/126Y/226G/233D, 6T/45G/51I/144G, 6T/45G/51I/188R/189Y, 6T/45G/51I/189V/228M/236W, 6T/45G/51I/189Y/208V, 6T/45G/51I/208V/233D, 6T/45G/149T/188R/189Y/208V/233D/236G, 6T/51I/126Y/144G/208V, 6T/51I/126Y/189V/233D, 6T/51I/126Y/189Y/208V/233D/236G, 6T/51I/126Y/189Y/231V/236W, 6T/51I/188R/189Y/208V/226G/228M, 6T/51I/188R/189Y/236G, 6T/51I/189Y/208V/231V/233D, 6T/51I/208V/226G/233D, 6T/51I/208V/231V, 6T/126Y/188R/231V/233D, 6T/144G/208V, 6T/188R/189Y/208V/228M/233D, 8A, 8A/36I/143G/147M/235G, 8A/36I/181R/235G, 8A/142L/147C, 8A/147C/181R/250A, 8A/147M/235G, 9V, 19V, 20L, 22G, 22G/147C/181R/235G/250A, 24V, 36I, 36I/143G/147C/181R, 36I/143G/147C/235G, 40L, 40V, 41G, 43P, 45G, 45G/51I, 45G/51I/126Y/144G/208V/226G/228M, 45G/51I/144G, 45G/51I/144G/208V/226G/231V/233D, 45G/51I/188R/189V, 45G/51I/189V/233D, 45G/51I/208V/226G/231V, 45G/51I/208V/233D, 45G/51I/226G, 45G/126Y/189Y, 45G/126Y/189Y/208V/226G, 45G/144G/189V/228M, 45G/144G/226G/231V/233D, 45G/188R/189Y, 45G/188R/189Y/208V/228M, 45G/188R/189Y/226G/228M, 45G/188R/189Y/231V/233D, 45G/189V, 45G/189V/208V, 45G/189Y, 45T, 46Q, 51I, 51I/126Y/144G/208V, 51I/126Y/144G/226G/231V/233D/236W, 51I/126Y/208V, 51I/144G, 51I/144G/226G, 51I/188R/189Y/228M, 51I/189V/208V/226G, 51I/189Y, 51I/208V, 51I/233D, 57S, 57T, 58T, 80M, 80M/135V/147C, 81W, 81W/126Y/144G/188R/208V/228M, 86L, 103G, 126L, 126Q, 126V, 126Y, 126Y/144G/188R/189V/226G, 134L, 135V, 141L, 142I, 142L, 143G, 143G/147C/235G, 144G, 144G/188R/228M, 146V, 147C, 147M, 149F, 181R, 188R, 188R/189V/233D, 189T, 189V, 189Y, 207C, 207G, 208V, 208V/226G/233D, 208V/228M, 208V/231V/233D, 226G, 228M, 230E, 231V, 232S, 233D, 235G, 235P, 236A, 236G, 236I, 236W, 240F, 240W and 250A, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 776. In some embodiments, the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 776, and the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in the polypeptide sequence, at least one substitution or substitution set selected from: E3N, E3N/H8A/A22G/A36I/Q181R/K235G/R250A, S6T, S6T/A45G/T51I/L81W/M126Y/Q226G/S233D, S6T/A45G/T51I/A144G, S6T/A45G/T51I/Q188R/E189Y, S6T/A45G/T51I/E189V/L228M/Q236W, S6T/A45G/T51I/E189Y/S208V, S6T/A45G/T51I/S208V/S233D, S6T/A45G/A149T/Q188R/E189Y/S208V/S233D/Q236G, S6T/T51I/M126Y/A144G/S208V, S6T/T51I/M126Y/E189V/S233D, S6T/T51I/M126Y/E189Y/S208V/S233D/Q236G, S6T/T51I/M126Y/E189Y/A231V/Q236W, S6T/T51I/Q188R/E189Y/S208V/Q226G/L228M, S6T/T51I/Q188R/E189Y/Q236G, S6T/T51I/E189Y/S208V/A231V/S233D, S6T/T51I/S208V/Q226G/S233D, S6T/T51I/S208V/A231V, S6T/M126Y/Q188R/A231V/S233D, S6T/A144G/S208V, S6T/Q188R/E189Y/S208V/L228M/S233D, H8A, H8A/A36I/A143G/I147M/K235G, H8A/A36I/Q181R/K235G, H8A/E142L/I147C, H8A/I147C/Q181R/R250A, H8A/I147M/K235G, M9V, A19V, T20L, A22G, A22G/I147C/Q181R/K235G/R250A, L24V, A36I, A36I/A143G/I147C/Q181R, A36I/A143G/I147C/K235G, N40L, N40V, P41G, K43P, A45G, A45G/T51I, A45G/T51I/M126Y/A144G/S208V/Q226G/L228M, A45G/T51I/A144G, A45G/T51I/A144G/S208V/Q226G/A231V/S233D, A45G/T51I/Q188R/E189V, A45G/T51I/E189V/S233D, A45G/T51I/S208V/Q226G/A231V, A45G/T51I/S208V/S233D, A45G/T51I/Q226G, A45G/M126Y/E189Y, A45G/M126Y/E189Y/S208V/Q226G, A45G/A144G/E189V/L228M, A45G/A144G/Q226G/A231V/S233D, A45G/Q188R/E189Y, A45G/Q188R/E189Y/S208V/L228M, A45G/Q188R/E189Y/Q226G/L228M, A45G/Q188R/E189Y/A231V/S233D, A45G/E189V, A45G/E189V/S208V, A45G/E189Y, A45T, S46Q, T51I, T51I/M126Y/A144G/S208V, T51I/M126Y/A144G/Q226G/A231V/S233D/Q236W, T51I/M126Y/S208V, T51I/A144G, T51I/A144G/Q226G, T51I/Q188R/E189Y/L228M, T51I/E189V/S208V/Q226G, T51I/E189Y, T51I/S208V, T51I/S233D, L57S, L57T, D58T, I80M, I80M/E135V/I147C, L81W, L81W/M126Y/A144G/Q188R/S208V/L228M, I86L, N103G, M126L, M126Q, M126V, M126Y, M126Y/A144G/Q188R/E189V/Q226G, F134L, E135V, V141L, E142I, E142L, A143G, A143G/I147C/K235G, A144G, A144G/Q188R/L228M, S146V, I147C, I147M, A149F, Q181R, Q188R, Q188R/E189V/S233D, E189T, E189V, E189Y, A207C, A207G, S208V, S208V/Q226G/S233D, S208V/L228M, S208V/A231V/S233D, Q226G, L228M, N230E, A231V, E232S, S233D, K235G, K235P, Q236A, Q236G, Q236I, Q236W, H240F, H240W and R250A, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 776. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 776. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 776.
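
Each group of variants above is listed in three forms: by position only (for example `6/45/51/144`), by position and new residue (`6T/45G/51I/144G`), and in full (`S6T/A45G/T51I/A144G`). The short Python sketch below, offered only as an illustration, shows how the three forms relate when a variant is compared with its reference. It assumes a variant of the same length as the reference, with no insertions or deletions; the function name and the eight-residue toy sequences are hypothetical and are not SEQ ID NO: 776.

```python
from typing import Dict


def describe_variant(reference: str, variant: str) -> Dict[str, str]:
    """List the differences between two equal-length sequences in three notations."""
    if len(reference) != len(variant):
        raise ValueError("this sketch handles substitutions only")
    # Collect (reference residue, 1-based position, new residue) for each mismatch.
    diffs = [
        (ref, pos, new)
        for pos, (ref, new) in enumerate(zip(reference, variant), start=1)
        if ref != new
    ]
    return {
        "positions": "/".join(str(pos) for _, pos, _ in diffs),
        "position_and_residue": "/".join(f"{pos}{new}" for _, pos, new in diffs),
        "full": "/".join(f"{ref}{pos}{new}" for ref, pos, new in diffs),
    }


# Toy sequences (hypothetical): S to N at position 3 and S to T at position 6.
forms = describe_variant("MESKLSTA", "MENKLTTA")
for name, value in forms.items():
    print(name, value)
```

The position-only form is the most general of the three, since it does not fix which residue replaces the original one.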

In some embodiments, the polypeptide sequence has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 868, and wherein the polypeptide of the engineered uridine phosphorylase comprises at least one substitution or substitution set at one or more positions in said polypeptide sequence selected from: 6, 6/8/9/126/147, 6/9/24/181/189, 6/9/181/189/235, 6/9/208/233/235, 6/24, 6/24/43/46/181/189, 6/24/43/126/147/189, 6/24/46/103/181/208, 6/24/46/147/240, 6/24/103/189, 6/24/126/189, 6/24/147, 6/46/126/147/181/235/240, 6/46/147/189/240, 6/103, 6/103/147/230/233, 6/103/189/235, 6/126/181/189, 6/126/181/189/235, 6/126/233/235, 6/181/230/233/235, 9/43/46/103/189/233/240, 9/46/126/147/181, 9/46/147/233, 24/43/46/147/230/235, 24/46/126, 24/46/147/189, 24/46/208/230/233/235, 24/103/126/147, 24/103/126/147/181/189/208/233, 24/147, 24/147/189/230/233, 24/181/189/230/233/235, 24/189/230, 24/208, 43/46/126/147/189/240, 43/103/189/208/233, 103/126/189/233/235, 103/147/181, 147/181/233, 189, 189/235 and 208/233, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO: 868. In some embodiments, the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 868, and wherein the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in said polypeptide sequence, at least one substitution or substitution set selected from: 6T, 6T/8A/9V/126Y/147C, 6T/9V/24V/181R/189E, 6T/9V/181R/189E/235P, 6T/9V/208V/233D/235P, 6T/24V, 6T/24V/43P/46Q/181R/189E, 6T/24V/43P/126Y/147C/189E, 6T/24V/46Q/103G/181R/208V, 6T/24V/46Q/147C/240F, 6T/24V/103G/189E, 6T/24V/126Y/189E, 6T/24V/147C, 6T/46Q/126Y/147C/181R/235P/240F, 6T/46Q/147C/189E/240F, 6T/103G, 6T/103G/147C/230E/233D, 6T/103G/189E/235P, 6T/126Y/181R/189E, 6T/126Y/181R/189E/235P, 6T/126Y/233D/235P, 6T/181R/230E/233D/235P, 9V/43P/46Q/103G/189E/233D/240W, 9V/46Q/126Y/147C/181R, 9V/46Q/147C/233D, 24V/43P/46Q/147C/230E/235P, 24V/46Q/126Y, 24V/46Q/147C/189E, 24V/46Q/208V/230E/233D/235P, 24V/103G/126Y/147C, 24V/103G/126Y/147C/181R/189E/208V/233D, 24V/147C, 24V/147C/189E/230E/233D, 24V/181R/189E/230E/233D/235P, 24V/189E/230E, 24V/208V, 43P/46Q/126Y/147C/189E/240F, 43P/103G/189E/208V/233D, 103G/126Y/189E/233D/235P, 103G/147C/181R, 147C/181R/233D, 189E, 189E/235P and 208V/233D, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO: 868. In some embodiments, the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 868, and wherein the polypeptide of the engineered uridine phosphorylase comprises, at one or more positions in said polypeptide sequence, at least one substitution or substitution set selected from: S6T, S6T/H8A/M9V/M126Y/I147C, S6T/M9V/L24V/Q181R/V189E, S6T/M9V/Q181R/V189E/K235P, S6T/M9V/S208V/S233D/K235P, S6T/L24V, S6T/L24V/K43P/S46Q/Q181R/V189E, S6T/L24V/K43P/M126Y/I147C/V189E, S6T/L24V/S46Q/N103G/Q181R/S208V, S6T/L24V/S46Q/I147C/H240F, S6T/L24V/N103G/V189E, S6T/L24V/M126Y/V189E, S6T/L24V/I147C, S6T/S46Q/M126Y/I147C/Q181R/K235P/H240F, S6T/S46Q/I147C/V189E/H240F, S6T/N103G, S6T/N103G/I147C/N230E/S233D, S6T/N103G/V189E/K235P, S6T/M126Y/Q181R/V189E, S6T/M126Y/Q181R/V189E/K235P, S6T/M126Y/S233D/K235P, S6T/Q181R/N230E/S233D/K235P, M9V/K43P/S46Q/N103G/V189E/S233D/H240W, M9V/S46Q/M126Y/I147C/Q181R, M9V/S46Q/I147C/S233D, L24V/K43P/S46Q/I147C/N230E/K235P, L24V/S46Q/M126Y, L24V/S46Q/I147C/V189E, L24V/S46Q/S208V/N230E/S233D/K235P, L24V/N103G/M126Y/I147C, L24V/N103G/M126Y/I147C/Q181R/V189E/S208V/S233D, L24V/I147C, L24V/I147C/V189E/N230E/S233D, L24V/Q181R/V189E/N230E/S233D/K235P, L24V/V189E/N230E, L24V/S208V, K43P/S46Q/M126Y/I147C/V189E/H240F, K43P/N103G/V189E/S208V/S233D, N103G/M126Y/V189E/S233D/K235P, N103G/I147C/Q181R, I147C/Q181R/S233D, V189E, V189E/K235P and S208V/S233D, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO: 868. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 868. In some embodiments, the engineered uridine phosphorylase comprises a polypeptide sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 868.
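
The slash-delimited substitution sets above (for example 6/24, 6T/24V or S6T/L24V) can be read mechanically. The following is a minimal sketch, not part of the disclosure, of one way to parse such a set and apply it to a sequence. The names `Substitution`, `parse_substitution_set` and `apply_substitutions` are hypothetical, and the nine-residue sequence in the usage note is a toy string, not SEQ ID NO: 868.

```python
import re
from typing import List, NamedTuple, Optional


class Substitution(NamedTuple):
    ref: Optional[str]  # reference residue (e.g. "S"), None when omitted
    pos: int            # 1-based position, numbered against the reference
    alt: Optional[str]  # variant residue (e.g. "T"), None for position-only


# Accepts the three forms used in the lists: "6", "6T" and "S6T".
_TOKEN = re.compile(r"^([A-Z])?(\d+)([A-Z])?$")


def parse_substitution_set(text: str) -> List[Substitution]:
    """Split a set such as "S6T/L24V" into its individual substitutions."""
    subs = []
    for token in text.strip().split("/"):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution token: {token!r}")
        subs.append(Substitution(match.group(1), int(match.group(2)), match.group(3)))
    return subs


def apply_substitutions(sequence: str, subs: List[Substitution]) -> str:
    """Return a copy of `sequence` with every substitution applied."""
    residues = list(sequence)
    for sub in subs:
        if sub.alt is None:
            raise ValueError(f"position {sub.pos} names no variant residue")
        index = sub.pos - 1  # convert 1-based numbering to a list index
        if sub.ref is not None and residues[index] != sub.ref:
            raise ValueError(
                f"expected {sub.ref} at position {sub.pos}, found {residues[index]}"
            )
        residues[index] = sub.alt
    return "".join(residues)
```

With the toy sequence "MAAAASAHM", `apply_substitutions("MAAAASAHM", parse_substitution_set("S6T/H8A/M9V"))` gives "MAAAATAAV". Checking the reference residue before substituting is a design choice that catches numbering mistakes early.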

In some additional embodiments, the present invention provides engineered uridine phosphorylases, wherein the engineered uridine phosphorylase comprises a polypeptide sequence that is at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to the sequence of at least one engineered uridine phosphorylase variant listed in Table 1.2, Table 2.2, Table 3.1, Table 4.1, Table 5.2 and/or Table 6.1.
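
To make a threshold such as "at least 85% identical" concrete, here is a minimal sketch of a percent-identity calculation. It rests on several assumptions that are not taken from the text above: the two sequences are already aligned to equal length, gaps are written as "-", and the denominator is the number of alignment columns. Other conventions exist, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes the inputs are pre-aligned (equal length, gaps as `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the percent identity is at least `threshold` (e.g. 85.0)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For two ten-residue toy strings that differ at one column, the result is 90.0, which satisfies an 85% threshold but not a 95% one.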

In some additional embodiments, the present invention provides engineered uridine phosphorylases, wherein the engineered uridine phosphorylase comprises a polypeptide sequence that is at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 246, SEQ ID NO: 594, SEQ ID NO: 776 and/or SEQ ID NO: 868. In some embodiments, the engineered uridine phosphorylase comprises a variant engineered uridine phosphorylase set forth in SEQ ID NO: 4, SEQ ID NO: 246, SEQ ID NO: 594, SEQ ID NO: 776 and/or SEQ ID NO: 868.

The present invention also provides engineered uridine phosphorylases, wherein the engineered uridine phosphorylase comprises a polypeptide sequence that is at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to the sequence of at least one engineered uridine phosphorylase variant set forth in the even-numbered sequences of SEQ ID NOs: 4-1196.

The present invention also provides engineered uridine phosphorylases, wherein the engineered uridine phosphorylase comprises at least one improved property compared to wild-type Escherichia coli uridine phosphorylase. In some embodiments, the improved property comprises improved activity on a substrate. In some additional embodiments, the substrate comprises 5'-isobutyryl ribose-1-phosphate (compound (2)) and uracil (compound (3)). In some additional embodiments, the improved property comprises improved production of compound (1) from compound (2) and compound (3). In yet some additional embodiments, the engineered uridine phosphorylase is purified. The present invention also provides compositions comprising at least one engineered uridine phosphorylase provided herein.

The present invention also provides polynucleotide sequences encoding at least one engineered uridine phosphorylase provided herein. In some embodiments, the polynucleotide sequence encoding at least one engineered uridine phosphorylase comprises a polynucleotide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 245, SEQ ID NO: 593, SEQ ID NO: 775 and/or SEQ ID NO: 867. In some embodiments, the polynucleotide sequence encoding at least one engineered uridine phosphorylase comprises a polynucleotide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 245, SEQ ID NO: 593, SEQ ID NO: 775 and/or SEQ ID NO: 867, wherein the polynucleotide sequence of the engineered uridine phosphorylase comprises at least one substitution at one or more positions. In some additional embodiments, the polynucleotide sequence encoding at least one engineered uridine phosphorylase, or a functional fragment thereof, comprises a polynucleotide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 245, SEQ ID NO: 593, SEQ ID NO: 775 and/or SEQ ID NO: 867. In yet some additional embodiments, the polynucleotide sequence is operably linked to a control sequence. In some additional embodiments, the polynucleotide sequence is codon optimized. In yet some additional embodiments, the polynucleotide sequence comprises a polynucleotide sequence set forth in the odd-numbered sequences of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 245, SEQ ID NO: 593, SEQ ID NO: 775 and/or SEQ ID NO: 867.
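
The sequence identifiers cited so far follow a visible pattern: polynucleotides carry odd numbers (1, 3, 245, 593, 775, 867) and polypeptides carry even numbers (2, 4, 246, 594, 776, 868). The sketch below assumes that each odd-numbered polynucleotide encodes the polypeptide with the next even number. That pairing is inferred from the parallel lists and is not stated in the passage; the function name is hypothetical.

```python
def encoding_polynucleotide_id(polypeptide_seq_id: int) -> int:
    """Return the SEQ ID NO assumed to encode an even-numbered polypeptide.

    Assumption: polynucleotide n - 1 encodes polypeptide n.
    """
    if polypeptide_seq_id < 2 or polypeptide_seq_id % 2 != 0:
        raise ValueError("polypeptide SEQ ID NOs are expected to be even")
    return polypeptide_seq_id - 1


# Even-numbered variant polypeptides in the range SEQ ID NOs: 4-1196.
variant_polypeptide_ids = list(range(4, 1197, 2))
```

Under this assumption the range 4-1196 covers 597 variant polypeptides, and `encoding_polynucleotide_id(868)` returns 867.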

The present invention also provides expression vectors comprising at least one polynucleotide sequence provided herein. The present invention also provides host cells comprising at least one expression vector provided herein. In some embodiments, the present invention provides host cells comprising at least one polynucleotide sequence provided herein.

The present invention also provides methods of producing an engineered uridine phosphorylase in a host cell, the methods comprising culturing a host cell provided herein under suitable conditions, such that at least one engineered uridine phosphorylase is produced. In some embodiments, the methods further comprise recovering at least one engineered uridine phosphorylase from the culture and/or the host cell. In some additional embodiments, the methods further comprise the step of purifying said at least one engineered uridine phosphorylase.

Description of the invention

The present invention provides engineered uridine phosphorylase (UP) enzymes, polypeptides having UP activity, and polynucleotides encoding these enzymes, as well as vectors and host cells comprising these polynucleotides and polypeptides. Methods for producing UP enzymes are also provided. The present invention further provides compositions comprising the UP enzymes and methods of using the engineered UP enzymes. The present invention finds particular use in the production of pharmaceutical compounds.

Unless defined otherwise, all technical and scientific terms used herein generally have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Generally, the nomenclature used herein and the laboratory procedures of cell culture, molecular genetics, microbiology, organic chemistry, analytical chemistry and nucleic acid chemistry described below are those well known and commonly employed in the art. Such techniques are well known and are described in numerous texts and reference works well known to those of skill in the art. Standard techniques, or modifications thereof, are used for chemical syntheses and chemical analyses. All patents, patent applications, articles and publications mentioned herein, both above and below, are hereby expressly incorporated herein by reference.

Although any suitable methods and materials similar or equivalent to those described herein may be used in the practice of the present invention, some methods and materials are described herein. It is to be understood that this invention is not limited to the particular methodology, protocols, and reagents described, as these may vary depending upon the context in which they are used by those of skill in the art. Accordingly, the terms defined immediately below are more fully described by reference to the invention as a whole.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the present invention. The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described. Numeric ranges are inclusive of the numbers defining the range. Thus, every numerical range disclosed herein is intended to encompass every narrower numerical range that falls within such broader numerical range, as if such narrower numerical ranges were all expressly written herein. It is also intended that every maximum (or minimum) numerical limitation disclosed herein includes every lower (or higher) numerical limitation, as if such lower (or higher) numerical limitations were expressly written herein.

Abbreviations and definitions

The abbreviations used for the genetically encoded amino acids are conventional and are as follows: alanine (Ala or A), arginine (Arg or R), asparagine (Asn or N), aspartic acid (Asp or D), cysteine (Cys or C), glutamic acid (Glu or E), glutamine (Gln or Q), glycine (Gly or G), histidine (His or H), isoleucine (Ile or I), leucine (Leu or L), lysine (Lys or K), methionine (Met or M), phenylalanine (Phe or F), proline (Pro or P), serine (Ser or S), threonine (Thr or T), tryptophan (Trp or W), tyrosine (Tyr or Y), and valine (Val or V).

When the three-letter abbreviations are used, unless specifically preceded by an "L" or a "D", or clear from the context in which the abbreviation is used, the amino acid may be in either the L- or D-configuration about the α-carbon (Cα). For example, whereas "Ala" designates alanine without specifying the configuration about the α-carbon, "D-Ala" and "L-Ala" designate D-alanine and L-alanine, respectively. When the one-letter abbreviations are used, upper case letters designate amino acids in the L-configuration about the α-carbon and lower case letters designate amino acids in the D-configuration about the α-carbon. For example, "A" designates L-alanine and "a" designates D-alanine. When polypeptide sequences are presented as a string of one-letter or three-letter abbreviations (or mixtures thereof), the sequences are presented in the amino (N) to carboxy (C) direction in accordance with common convention.
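
The abbreviation table and the upper/lower case rule for one-letter codes can be captured in a short lookup. This is an illustrative sketch only; `THREE_TO_ONE`, `to_one_letter` and `describe_residue` are hypothetical names. The table lists the twenty standard residues, glycine (Gly, G) included, and treating glycine as having no L/D prefix is an assumption added here because glycine has no stereocenter at the α-carbon.

```python
# Conventional three-letter to one-letter codes for the standard residues.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(residues):
    """Convert three-letter codes, given in N-to-C order, to a one-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in residues)


def describe_residue(code: str) -> str:
    """Expand a one-letter code: upper case is L-configuration, lower case is D."""
    if len(code) != 1 or code.upper() not in ONE_TO_THREE:
        raise ValueError(f"unknown one-letter code: {code!r}")
    three = ONE_TO_THREE[code.upper()]
    if three == "Gly":
        return three  # assumption: glycine is achiral, so no prefix is given
    prefix = "L" if code.isupper() else "D"
    return f"{prefix}-{three}"
```

For example, `describe_residue("A")` returns "L-Ala" and `describe_residue("a")` returns "D-Ala", matching the example in the paragraph above.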

The abbreviations used for the genetically encoding nucleosides are conventional and are as follows: adenosine (A); guanosine (G); cytidine (C); thymidine (T); and uridine (U). Unless specifically delineated, the abbreviated nucleosides may be either ribonucleosides or 2'-deoxyribonucleosides. The nucleosides may be specified as being either ribonucleosides or 2'-deoxyribonucleosides on an individual basis or on an aggregate basis. When nucleic acid sequences are presented as a string of one-letter abbreviations, the sequences are presented in the 5' to 3' direction in accordance with common convention, and the phosphates are not indicated.

In reference to the present invention, the technical and scientific terms used in the descriptions herein will have the meanings commonly understood by one of ordinary skill in the art, unless specifically defined otherwise. Accordingly, the following terms are intended to have the following meanings.

As used herein, the singular forms "a," "an," and "the" include plural referents unless the context clearly indicates otherwise. Thus, for example, reference to "a polypeptide" includes more than one polypeptide.

Similarly, "comprise," "comprises," "comprising," "include," "includes," and "including" are interchangeable and not intended to be limiting. Thus, as used herein, the term "comprising" and its cognates are used in their inclusive sense (i.e., equivalent to the term "including" and its corresponding cognates).

It is to be further understood that where descriptions of various embodiments use the term "comprising," those skilled in the art would understand that in some specific instances, an embodiment can be alternatively described using the language "consisting essentially of" or "consisting of."

As used herein, the term "about" means an acceptable error for a particular value. In some instances, "about" means within 0.05%, 0.5%, 1.0%, or 2.0% of a given value. In some instances, "about" means within 1, 2, 3, or 4 standard deviations of a given value.
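
The two readings of "about" given above (a percentage window and a standard-deviation window) can be expressed as simple checks. This is a minimal sketch; the function names and the default arguments are hypothetical choices, not values fixed by the definition.

```python
def is_about_percent(value: float, target: float, percent: float = 2.0) -> bool:
    """True when `value` lies within `percent` percent of `target`."""
    return abs(value - target) <= abs(target) * percent / 100.0


def is_about_sd(value: float, mean: float, sd: float, n_sd: int = 1) -> bool:
    """True when `value` lies within `n_sd` standard deviations of `mean`."""
    return abs(value - mean) <= n_sd * sd
```

A value of 101.5 is "about" 100 under the 2.0% window but not under the 1.0% window, which shows why the chosen tolerance should be stated alongside the value.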

As used herein, "EC" number refers to the Enzyme Nomenclature of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB). The IUBMB biochemical classification is a numerical classification system for enzymes based on the chemical reactions they catalyze.

As used herein, "ATCC" refers to the American Type Culture Collection, whose biorepository collection includes genes and strains.

As used herein, "NCBI" refers to the National Center for Biotechnology Information and the sequence databases provided therein.

As used herein, "uridine phosphorylase" ("UP") enzymes are enzymes that catalyze the reversible conversion of 5'-isobutyryl ribose-1-phosphate (compound (2)) and uracil (compound (3)) and/or related compounds to 5'-isobutyryl uridine (compound (1)) and/or related compounds. UP enzymes may be naturally occurring, including the wild-type E. coli UP enzyme or other uridine phosphorylases or nucleoside transferases found in humans, bacteria, fungi, plants or other species, or the UP enzymes may be engineered polypeptides generated by human manipulation.

"Protein," "polypeptide," and "peptide" are used interchangeably herein to denote a polymer of at least two amino acids covalently linked by an amide bond, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation). Included within this definition are D- and L-amino acids, and mixtures of D- and L-amino acids, as well as polymers comprising D- and L-amino acids, and mixtures of D- and L-amino acids.

"Amino acids" are referred to herein by either their commonly known three-letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.

As used herein, "hydrophilic amino acid or residue" refers to an amino acid or residue having a side chain exhibiting a hydrophobicity of less than zero according to the normalized consensus hydrophobicity scale of Eisenberg et al. (Eisenberg et al., J. Mol. Biol., 179:125-142 [1984]). Genetically encoded hydrophilic amino acids include L-Thr (T), L-Ser (S), L-His (H), L-Glu (E), L-Asn (N), L-Gln (Q), L-Asp (D), L-Lys (K), and L-Arg (R).

As used herein, "acidic amino acid or residue" refers to a hydrophilic amino acid or residue having a side chain exhibiting a pKa value of less than about 6 when the amino acid is included in a peptide or polypeptide. Acidic amino acids typically have negatively charged side chains at physiological pH due to loss of a hydrogen ion. Genetically encoded acidic amino acids include L-Glu (E) and L-Asp (D).

As used herein, "basic amino acid or residue" refers to a hydrophilic amino acid or residue having a side chain exhibiting a pKa value of greater than about 6 when the amino acid is included in a peptide or polypeptide. Basic amino acids typically have positively charged side chains at physiological pH due to association with a hydronium ion. Genetically encoded basic amino acids include L-Arg (R) and L-Lys (K).

As used herein, "polar amino acid or residue" refers to a hydrophilic amino acid or residue having a side chain that is uncharged at physiological pH, but which has at least one bond in which the pair of electrons shared in common by two atoms is held more closely by one of the atoms. Genetically encoded polar amino acids include L-Asn (N), L-Gln (Q), L-Ser (S), and L-Thr (T).

As used herein, "hydrophobic amino acid or residue" refers to an amino acid or residue having a side chain exhibiting a hydrophobicity of greater than zero according to the normalized consensus hydrophobicity scale of Eisenberg et al. (Eisenberg et al., J. Mol. Biol., 179:125-142 [1984]). Genetically encoded hydrophobic amino acids include L-Pro (P), L-Ile (I), L-Phe (F), L-Val (V), L-Leu (L), L-Trp (W), L-Met (M), L-Ala (A), and L-Tyr (Y).

As used herein, "aromatic amino acid or residue" refers to a hydrophilic or hydrophobic amino acid or residue having a side chain that includes at least one aromatic or heteroaromatic ring. Genetically encoded aromatic amino acids include L-Phe (F), L-Tyr (Y), and L-Trp (W). Although L-His (H) is sometimes classified as a basic residue owing to the pKa of its heteroaromatic nitrogen atom, or as an aromatic residue because its side chain includes a heteroaromatic ring, herein histidine is classified as a hydrophilic residue or as a "constrained residue" (see below).

As used herein, "constrained amino acid or residue" refers to an amino acid or residue that has a constrained geometry. Herein, constrained residues include L-Pro (P) and L-His (H). Histidine has a constrained geometry because it has a relatively small imidazole ring. Proline has a constrained geometry because it also has a five-membered ring.

As used herein, "non-polar amino acid or residue" refers to a hydrophobic amino acid or residue having a side chain that is uncharged at physiological pH and which has bonds in which the pair of electrons shared in common by two atoms is generally held equally by each of the two atoms (i.e., the side chain is not polar). Genetically encoded non-polar amino acids include L-Gly (G), L-Leu (L), L-Val (V), L-Ile (I), L-Met (M), and L-Ala (A).

As used herein, "aliphatic amino acid or residue" refers to a hydrophobic amino acid or residue having an aliphatic hydrocarbon side chain. Genetically encoded aliphatic amino acids include L-Ala (A), L-Val (V), L-Leu (L), and L-Ile (I). It is noted that cysteine (or "L-Cys" or "[C]") is unusual in that it can form disulfide bridges with other L-Cys (C) amino acids or other sulfanyl- or sulfhydryl-containing amino acids. "Cysteine-like residues" include cysteine and other amino acids that contain sulfhydryl moieties available for formation of disulfide bridges. The ability of L-Cys (C) (and other amino acids with -SH containing side chains) to exist in a peptide in either the reduced free -SH form or the oxidized disulfide-bridged form affects whether L-Cys (C) contributes net hydrophobic or hydrophilic character to the peptide. While L-Cys (C) exhibits a hydrophobicity of 0.29 according to the normalized consensus scale of Eisenberg (Eisenberg et al., 1984, supra), it is to be understood that for purposes of the present disclosure, L-Cys (C) is categorized into its own unique group.

As used herein, "small amino acid or residue" refers to an amino acid or residue having a side chain composed of a total of three or fewer carbon atoms and/or heteroatoms (excluding the α-carbon and hydrogens). The small amino acids or residues may be further categorized as aliphatic, non-polar, polar or acidic small amino acids or residues, in accordance with the above definitions. Genetically encoded small amino acids include L-Ala (A), L-Val (V), L-Cys (C), L-Asn (N), L-Ser (S), L-Thr (T), and L-Asp (D).

As used herein, "hydroxyl-containing amino acid or residue" refers to an amino acid containing a hydroxyl (-OH) moiety. Genetically encoded hydroxyl-containing amino acids include L-Ser (S), L-Thr (T), and L-Tyr (Y).
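
The residue classes defined in the preceding paragraphs overlap, so a lookup table helps when asking which classes a residue belongs to, or which classes two residues share. The sketch below transcribes the memberships listed above using one-letter codes; the dictionary keys and the function names `classes_of` and `shared_classes` are hypothetical labels chosen for illustration.

```python
# Class memberships transcribed from the definitions above (one-letter codes).
RESIDUE_CLASSES = {
    "hydrophilic": frozenset("TSHENQDKR"),
    "acidic": frozenset("ED"),
    "basic": frozenset("RK"),
    "polar": frozenset("NQST"),
    "hydrophobic": frozenset("PIFVLWMAY"),
    "aromatic": frozenset("FYW"),
    "constrained": frozenset("PH"),
    "non-polar": frozenset("GLVIMA"),
    "aliphatic": frozenset("AVLI"),
    "cysteine-like": frozenset("C"),
    "small": frozenset("AVCNSTD"),
    "hydroxyl-containing": frozenset("STY"),
}


def classes_of(residue: str) -> list:
    """Return, in alphabetical order, every class that contains `residue`."""
    return sorted(
        name for name, members in RESIDUE_CLASSES.items() if residue in members
    )


def shared_classes(residue_a: str, residue_b: str) -> list:
    """Return the classes that contain both residues."""
    return sorted(set(classes_of(residue_a)) & set(classes_of(residue_b)))
```

As an illustration, serine and threonine share four classes (hydrophilic, hydroxyl-containing, polar, small), whereas valine and glutamic acid share none under these definitions.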

As used herein, "polynucleotide" and "nucleic acid" refer to two or more nucleotides that are covalently linked together. The polynucleotide may be wholly comprised of ribonucleotides (i.e., RNA), wholly comprised of 2' deoxyribonucleotides (i.e., DNA), or comprised of mixtures of ribo- and 2' deoxyribonucleotides. While the nucleosides will typically be linked together via standard phosphodiester linkages, the polynucleotides may include one or more non-standard linkages. The polynucleotide may be single-stranded or double-stranded, or may include both single-stranded regions and double-stranded regions. Moreover, while a polynucleotide will typically be composed of the naturally occurring encoding nucleobases (i.e., adenine, guanine, uracil, thymine and cytosine), it may include one or more modified and/or synthetic nucleobases, such as, for example, inosine, xanthine, hypoxanthine, etc. In some embodiments, such modified or synthetic nucleobases are nucleobases encoding amino acid sequences.

As used herein, "nucleoside" refers to a glycosylamine comprising a nucleobase (i.e., a nitrogenous base) and a 5-carbon sugar (e.g., ribose or deoxyribose). Non-limiting examples of nucleosides include cytidine, uridine, adenosine, guanosine, thymidine, and inosine. In contrast, the term "nucleotide" refers to a glycosylamine comprising a nucleobase, a 5-carbon sugar, and one or more phosphate groups. In some embodiments, nucleosides can be phosphorylated by kinases to produce nucleotides.

As used herein, "nucleoside diphosphate" refers to a glycosylamine comprising a nucleobase (i.e., a nitrogenous base), a 5-carbon sugar (e.g., ribose or deoxyribose), and a diphosphate (i.e., pyrophosphate) moiety. In some embodiments herein, "nucleoside diphosphate" is abbreviated as "NDP". Non-limiting examples of nucleoside diphosphates include cytidine diphosphate (CDP), uridine diphosphate (UDP), adenosine diphosphate (ADP), guanosine diphosphate (GDP), thymidine diphosphate (TDP), and inosine diphosphate (IDP). In some cases, the terms "nucleoside" and "nucleotide" are used interchangeably.

As used herein, "coding sequence" refers to that portion of a nucleic acid (e.g., a gene) that encodes the amino acid sequence of a protein.

As used herein, the terms "biocatalysis," "biocatalytic," "biotransformation," and "biosynthesis" refer to the use of enzymes to perform chemical reactions on organic compounds.

As used herein, "wild-type" and "naturally occurring" refer to the form found in nature. For example, a wild-type polypeptide or polynucleotide sequence is a sequence present in an organism that can be isolated from a natural source and that has not been intentionally modified by human manipulation.

As used herein, "recombinant," "engineered," "variant," and "non-naturally occurring," when used with reference to a cell, nucleic acid, or polypeptide, refer to a material, or a material corresponding to the natural or native form of the material, that has been modified in a manner that would not otherwise exist in nature. In some embodiments, the cell, nucleic acid, or polypeptide is identical to a naturally occurring cell, nucleic acid, or polypeptide, but is produced or derived from synthetic materials and/or by manipulation using recombinant techniques. Non-limiting examples include, among others, recombinant cells that express genes not found in the native (non-recombinant) form of the cell, or that express native genes that are otherwise expressed at a different level.

The term "percent (%) sequence identity" is used herein to refer to comparisons among polynucleotides or polypeptides, and is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence for optimal alignment of the two sequences. The percentage can be calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the comparison window, and multiplying the result by 100 to yield the percentage of sequence identity. Alternatively, the percentage can be calculated by determining the number of positions at which either the identical nucleic acid base or amino acid residue occurs in both sequences, or a nucleic acid base or amino acid residue is aligned with a gap, to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the comparison window, and multiplying the result by 100 to yield the percentage of sequence identity.

Those of skill in the art appreciate that there are many established algorithms available to align two sequences. Optimal alignment of sequences for comparison can be conducted by any suitable method, including, but not limited to, the local homology algorithm of Smith and Waterman (Smith and Waterman, Adv. Appl. Math., 2:482 [1981]), the homology alignment algorithm of Needleman and Wunsch (Needleman and Wunsch, J. Mol. Biol., 48:443 [1970]), the search for similarity method of Pearson and Lipman (Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 [1988]), computerized implementations of these algorithms (e.g., GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Software Package), or visual inspection, as known in the art. Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity include, but are not limited to, the BLAST and BLAST 2.0 algorithms described by Altschul et al. (see, respectively, Altschul et al., J. Mol. Biol., 215:403-410 [1990]; and Altschul et al., Nucl. Acids Res., 25:3389-3402 [1997]). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information website.

The BLAST algorithm involves first identifying high-scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (see Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. For nucleotide sequences, cumulative scores are calculated using the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 [1992]). Exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin Software Package (Accelrys, Madison WI), using the default parameters provided.
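The two percentage calculations described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been optimally aligned by some other tool and that a gap is written as "-"; the function name `percent_identity` and the `gaps_count_as_matches` flag are illustrative choices, not names from this document or from any alignment package.

```python
GAP = "-"  # assumed gap symbol in the pre-aligned input


def percent_identity(aligned_query: str, aligned_ref: str,
                     gaps_count_as_matches: bool = False) -> float:
    """Percent identity over a pre-aligned comparison window.

    With gaps_count_as_matches=False, only columns holding the identical
    residue in both sequences are counted as matched positions.
    With gaps_count_as_matches=True, columns in which a residue is aligned
    with a gap are counted as matched positions as well (the alternative
    calculation described in the text).
    """
    if len(aligned_query) != len(aligned_ref) or not aligned_ref:
        raise ValueError("aligned sequences must be non-empty and equal in length")

    matched = 0
    for q, r in zip(aligned_query, aligned_ref):
        if q == GAP or r == GAP:
            # q != r excludes a column in which both sequences have a gap
            if gaps_count_as_matches and q != r:
                matched += 1
        elif q == r:
            matched += 1

    # The denominator is the total number of positions in the window.
    return 100.0 * matched / len(aligned_ref)
```

For the aligned pair `"MKT-AL"` and `"MKTGAV"`, four of the six columns hold identical residues, so the first calculation gives 4/6 (about 66.7%), while the alternative calculation also counts the gap column and gives 5/6 (about 83.3%).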

As used herein, "reference sequence" refers to a defined sequence used as a basis for a sequence and/or activity comparison. A reference sequence may be a subset of a larger sequence, for example, a segment of a full-length gene or polypeptide sequence. Generally, a reference sequence is at least 20 nucleotide or amino acid residues in length, at least 25 residues in length, at least 50 residues in length, at least 100 residues in length, or the full length of the nucleic acid or polypeptide. Because two polynucleotides or polypeptides may each (1) comprise a sequence (i.e., a portion of the complete sequence) that is similar between the two sequences, and (2) further comprise a sequence that is divergent between the two sequences, sequence comparisons between two (or more) polynucleotides or polypeptides are typically performed by comparing the sequences of the two polynucleotides or polypeptides over a "comparison window" to identify and compare local regions of sequence similarity. In some embodiments, a "reference sequence" can be based on a primary amino acid sequence, where the reference sequence is a sequence that can have one or more changes in the primary sequence.

As used herein, "comparison window" refers to a conceptual segment of at least about 20 contiguous nucleotide positions or amino acid residues, wherein a sequence may be compared to a reference sequence of at least 20 contiguous nucleotides or amino acids, and wherein the portion of the sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The comparison window can be longer than 20 contiguous residues, and optionally includes windows of 30, 40, 50, 100, or longer.
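As a rough illustration of the comparison-window constraints above, the following Python sketch checks a pre-aligned window for minimum length and for the 20% limit on additions or deletions. Measuring the gap fraction against the number of reference residues in the window is one possible reading of the definition, and the function and parameter names are hypothetical.

```python
GAP = "-"  # assumed gap symbol in the pre-aligned input


def window_is_acceptable(aligned_query: str, aligned_ref: str,
                         min_positions: int = 20,
                         max_indel_fraction: float = 0.20) -> bool:
    """Check a pre-aligned window against the length and gap limits."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")

    # Number of reference residues covered by the window.
    ref_positions = sum(1 for r in aligned_ref if r != GAP)
    if ref_positions < min_positions:
        return False

    # Columns where exactly one sequence has a gap (an addition or deletion).
    indel_columns = sum(
        1 for q, r in zip(aligned_query, aligned_ref)
        if (q == GAP) != (r == GAP)
    )
    return indel_columns / ref_positions <= max_indel_fraction
```

Under these assumptions, a 25-residue reference window aligned against a query with three deleted positions passes (12% gaps), while one with six deleted positions does not (24% gaps).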

As used herein, "corresponding to," "reference to," and "relative to," when used in the context of the numbering of a given amino acid or polynucleotide sequence, refer to the numbering of the residues of a specified reference sequence when the given amino acid or polynucleotide sequence is compared to the reference sequence. In other words, the residue number or residue position of a given polymer is designated with respect to the reference sequence, rather than by the actual numerical position of the residue within the given amino acid or polynucleotide sequence. For example, a given amino acid sequence, such as that of an engineered uridine phosphorylase, can be aligned to a reference sequence by introducing gaps to optimize residue matches between the two sequences. In these cases, although the gaps are present, the numbering of the residues in the given amino acid or polynucleotide sequence is made with respect to the reference sequence to which it has been aligned.
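The numbering convention described above can be sketched in Python as follows. The sketch assumes a pre-computed pairwise alignment with "-" as the gap symbol; the function name `reference_numbering` is illustrative only.

```python
from typing import List, Optional, Tuple

GAP = "-"  # assumed gap symbol in the pre-aligned input


def reference_numbering(aligned_query: str,
                        aligned_ref: str) -> List[Tuple[str, Optional[int]]]:
    """Number each query residue by the reference position it aligns with.

    Returns (residue, reference_position) pairs with 1-based positions.
    A query residue aligned to a gap in the reference (an insertion) has
    no corresponding reference position and is reported with None.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")

    numbered = []
    ref_position = 0
    for q, r in zip(aligned_query, aligned_ref):
        if r != GAP:
            ref_position += 1          # advance only on reference residues
        if q == GAP:
            continue                   # deletion in the query: nothing to number
        numbered.append((q, ref_position if r != GAP else None))
    return numbered
```

With the reference `"MKTAYL"` and a query aligned as `"M-TAYL"`, the threonine is the second residue of the query but is numbered 3, because it aligns with position 3 of the reference.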

As used herein, "substantial identity" refers to a polynucleotide or polypeptide sequence that has at least 80 percent sequence identity, at least 85 percent identity, between 89 and 95 percent sequence identity, or more usually at least 99 percent sequence identity, as compared to a reference sequence over a comparison window of at least 20 residue positions, frequently over a window of at least 30-50 residues, wherein the percentage of sequence identity is calculated by comparing the reference sequence to a sequence that includes deletions or additions totaling 20 percent or less of the reference sequence over the comparison window. In some specific embodiments applied to polypeptides, the term "substantial identity" means that two polypeptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least 80 percent sequence identity, preferably at least 89 percent sequence identity, at least 95 percent sequence identity or more (e.g., 99 percent sequence identity). In some embodiments, residue positions that are not identical in the sequences being compared differ by conservative amino acid substitutions.

As used herein, "amino acid difference" and "residue difference" refer to a difference in the amino acid residue at a position of a polypeptide sequence relative to the amino acid residue at a corresponding position in a reference sequence. In some cases, the reference sequence has a histidine tag, but the numbering is maintained relative to the equivalent reference sequence without the histidine tag. The positions of amino acid differences are generally referred to herein as "Xn", where n refers to the corresponding position in the reference sequence upon which the residue difference is based. For example, a "residue difference at position X93 as compared to SEQ ID NO:4" refers to a difference of the amino acid residue at the polypeptide position corresponding to position 93 of SEQ ID NO:4. Thus, if the reference polypeptide of SEQ ID NO:4 has a serine at position 93, then a "residue difference at position X93 as compared to SEQ ID NO:4" refers to an amino acid substitution of any residue other than serine at the polypeptide position corresponding to position 93 of SEQ ID NO:4. In most instances herein, the specific amino acid residue difference at a position is indicated as "XnY", where "Xn" specifies the corresponding position as described above, and "Y" is the single-letter identifier of the amino acid found in the engineered polypeptide (i.e., the residue that differs from the reference polypeptide). In some instances (e.g., in the tables presented in the Examples), the invention also provides specific amino acid differences denoted by the conventional notation "AnB", where A is the single-letter identifier of the residue in the reference sequence, "n" is the number of the residue position in the reference sequence, and B is the single-letter identifier of the residue substitution in the sequence of the engineered polypeptide. In some instances, a polypeptide of the invention can include one or more amino acid residue differences relative to a reference sequence, which are indicated by a list of the specified positions where residue differences are present relative to the reference sequence. In some embodiments, where more than one amino acid can be used at a specific residue position of a polypeptide, the various amino acid residues that can be used are separated by a "/" (e.g., X307H/X307P or X307H/P). The slash may also be used to indicate more than one substitution within a given variant (i.e., there is more than one substitution present in a given sequence, such as in a combinatorial variant). In some embodiments, the invention includes engineered polypeptide sequences comprising one or more amino acid differences that include conservative amino acid substitutions or non-conservative amino acid substitutions. In some additional embodiments, the invention provides engineered polypeptide sequences comprising both conservative and non-conservative amino acid substitutions.
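The "XnY" and "AnB" notations described above lend themselves to a small parser. The Python sketch below is only an illustration under stated assumptions: it handles single-letter residue codes for the 20 standard amino acids, treats a leading "X" as "reference residue not specified", and lets a bare letter after a slash (as in X307H/P) inherit the preceding position. The function name and return format are hypothetical, and deletions ("-") are not handled.

```python
import re
from typing import List, Optional, Tuple

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
FULL_TOKEN = re.compile(r"([A-Z])(\d+)([A-Z])")  # e.g. X93A or S93A


def parse_residue_differences(notation: str) -> List[Tuple[Optional[str], int, str]]:
    """Parse slash-separated residue differences.

    Returns (reference_residue, position, new_residue) tuples, where
    reference_residue is None for the "Xn" form.
    """
    differences = []
    previous = None  # (reference_residue, position) of the last full token
    for token in notation.split("/"):
        token = token.strip()
        match = FULL_TOKEN.fullmatch(token)
        if match:
            prefix, position, residue = match.group(1), int(match.group(2)), match.group(3)
            if prefix != "X" and prefix not in AMINO_ACIDS:
                raise ValueError(f"unknown reference residue in {token!r}")
            previous = (None if prefix == "X" else prefix, position)
        elif len(token) == 1 and previous is not None:
            residue = token  # shorthand: same position as the previous token
        else:
            raise ValueError(f"unrecognized token {token!r}")
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown residue {residue!r}")
        differences.append((previous[0], previous[1], residue))
    return differences
```

Because the slash is used both for alternatives at one position and for multiple substitutions within one variant, the parser only reports the listed differences; deciding which reading applies is left to the caller, who can check whether the positions repeat.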

As used herein, "conservative amino acid substitution" refers to a substitution of a residue with a different residue having a similar side chain, and thus typically involves substitution of the amino acid in the polypeptide with an amino acid within the same or a similar defined class of amino acids. By way of example and not limitation, in some embodiments, an amino acid with an aliphatic side chain is substituted with another aliphatic amino acid (e.g., alanine, valine, leucine, and isoleucine); an amino acid with a hydroxyl side chain is substituted with another amino acid with a hydroxyl side chain (e.g., serine and threonine); an amino acid having an aromatic side chain is substituted with another amino acid having an aromatic side chain (e.g., phenylalanine, tyrosine, tryptophan, and histidine); an amino acid with a basic side chain is substituted with another amino acid with a basic side chain (e.g., lysine and arginine); an amino acid with an acidic side chain is substituted with another amino acid with an acidic side chain (e.g., aspartic acid or glutamic acid); and/or a hydrophobic or hydrophilic amino acid is substituted with another hydrophobic or hydrophilic amino acid, respectively.

As used herein, "non-conservative substitution" refers to substitution of an amino acid in the polypeptide with an amino acid having significantly different side chain properties. Non-conservative substitutions may use amino acids between, rather than within, the defined groups, and affect (a) the structure of the peptide backbone in the area of the substitution (e.g., proline for glycine), (b) the charge or hydrophobicity, or (c) the bulk of the side chain. By way of example and not limitation, an exemplary non-conservative substitution can be an acidic amino acid substituted with a basic or aliphatic amino acid; an aromatic amino acid substituted with a small amino acid; and a hydrophilic amino acid substituted with a hydrophobic amino acid.
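The distinction between conservative and non-conservative substitutions can be sketched as a class lookup. The Python example below uses only the side-chain classes given as examples in the conservative-substitution definition (aliphatic, hydroxyl, aromatic, basic, acidic); it deliberately omits the broader hydrophobic/hydrophilic grouping and reports residues outside these classes as "unclassified". The names are illustrative assumptions.

```python
# Side-chain classes taken from the examples in the text (single-letter codes).
SIDE_CHAIN_CLASSES = {
    "aliphatic": set("AVLI"),
    "hydroxyl": set("ST"),
    "aromatic": set("FYWH"),
    "basic": set("KR"),
    "acidic": set("DE"),
}


def classes_of(residue: str) -> set:
    """Names of the listed classes that contain the residue."""
    return {name for name, members in SIDE_CHAIN_CLASSES.items() if residue in members}


def classify_substitution(reference: str, replacement: str) -> str:
    """Label a substitution using only the listed example classes."""
    if reference == replacement:
        return "identical"
    ref_classes, new_classes = classes_of(reference), classes_of(replacement)
    if not ref_classes or not new_classes:
        return "unclassified"  # residue not covered by the example classes
    return "conservative" if ref_classes & new_classes else "non-conservative"
```

Under this simplified scheme, leucine to isoleucine is labeled conservative, and aspartic acid to lysine (an acidic residue replaced by a basic one) is labeled non-conservative.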

As used herein, "deletion" refers to modification of a polypeptide by removal of one or more amino acids from the reference polypeptide. Deletions can comprise removal of 1 or more amino acids, 2 or more amino acids, 5 or more amino acids, 10 or more amino acids, 15 or more amino acids, or 20 or more amino acids, up to 10% of the total number of amino acids, or up to 20% of the total number of amino acids making up the reference enzyme, while retaining enzymatic activity and/or retaining the improved properties of an engineered uridine phosphorylase. Deletions can be directed to the internal portions and/or terminal portions of the polypeptide. In various embodiments, the deletion can comprise a continuous segment or can be discontinuous. Deletions are typically indicated by "-" in amino acid sequences.

As used herein, "insertion" refers to modification of a polypeptide by addition of one or more amino acids to the reference polypeptide. Insertions can be in the internal portions of the polypeptide, or at the carboxy or amino terminus. Insertions as used herein include fusion proteins as is known in the art. The insertion can be a contiguous segment of amino acids or separated by one or more of the amino acids in the naturally occurring polypeptide.

The term "amino acid substitution set" or "substitution set" refers to a group of amino acid substitutions in a polypeptide sequence, as compared to a reference sequence. A substitution set can have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more amino acid substitutions. In some embodiments, a substitution set refers to the set of amino acid substitutions that is present in any of the variant uridine phosphorylases listed in the tables provided in the Examples.

"Functional fragment" and "biologically active fragment" are used interchangeably herein to refer to a polypeptide that has an amino-terminal and/or carboxy-terminal deletion(s) and/or internal deletions, but where the remaining amino acid sequence is identical to the corresponding positions in the sequence to which it is being compared (e.g., a full-length engineered uridine phosphorylase of the invention) and that retains substantially all of the activity of the full-length polypeptide.

As used herein, "isolated polypeptide" refers to a polypeptide that is substantially separated from other contaminants that naturally accompany it (e.g., proteins, lipids, and polynucleotides). The term embraces polypeptides that have been removed or purified from their naturally occurring environment or expression system (e.g., within a host cell or via in vitro synthesis). Recombinant uridine phosphorylase polypeptides may be present within a cell, present in the cellular medium, or prepared in various forms, such as lysates or isolated preparations. As such, in some embodiments, the recombinant uridine phosphorylase polypeptides can be an isolated polypeptide.

As used herein, "substantially pure polypeptide" or "purified protein" refers to a composition in which the polypeptide species is the predominant species present (i.e., on a molar or weight basis it is more abundant than any other individual macromolecular species in the composition), and is generally a substantially purified composition when the object species comprises at least about 50 percent of the macromolecular species present by mole or % weight. However, in some embodiments, the composition comprising uridine phosphorylase comprises uridine phosphorylase that is less than 50% pure (e.g., about 10%, about 20%, about 30%, about 40%, or about 50%). Generally, a substantially pure uridine phosphorylase composition comprises about 60% or more, about 70% or more, about 80% or more, about 90% or more, about 95% or more, and about 98% or more of all macromolecular species by mole or % weight present in the composition. In some embodiments, the object species is purified to essential homogeneity (i.e., contaminant species cannot be detected in the composition by conventional detection methods), wherein the composition consists essentially of a single macromolecular species. Solvent species, small molecules (<500 Daltons), and elemental ion species are not considered macromolecular species. In some embodiments, the isolated recombinant uridine phosphorylase polypeptides are substantially pure polypeptide compositions.

As used herein, "improved enzyme property" refers to at least one improved property of an enzyme. In some embodiments, the invention provides engineered uridine phosphorylase polypeptides that exhibit an improvement in any enzyme property as compared to a reference uridine phosphorylase polypeptide and/or a wild-type uridine phosphorylase polypeptide and/or another engineered uridine phosphorylase polypeptide. Thus, the level of "improvement" can be determined and compared between various uridine phosphorylase polypeptides, including wild-type as well as engineered uridine phosphorylases. Improved properties include, but are not limited to, such properties as increased protein expression, increased thermoactivity, increased thermostability, increased pH activity, increased stability, increased enzymatic activity, increased substrate specificity or affinity, increased specific activity, increased resistance to substrate or end-product inhibition, increased chemical stability, improved chemoselectivity, improved solvent stability, increased tolerance to acidic pH, increased tolerance to proteolytic activity (i.e., reduced sensitivity to proteolysis), reduced aggregation, increased solubility, and altered temperature profile. In additional embodiments, the term is used in reference to at least one improved property of a uridine phosphorylase.

As used herein, "increased enzymatic activity" and "enhanced catalytic activity" refer to an improved property of the engineered polypeptides, which can be represented by an increase in specific activity (e.g., product produced/time/weight protein) or an increase in percent conversion of the substrate to the product (e.g., percent conversion of a starting amount of substrate to product in a specified time period using a specified amount of enzyme) as compared to the reference enzyme. In some embodiments, the terms refer to an improved property of the engineered uridine phosphorylase polypeptides provided herein, which can be represented by an increase in specific activity (e.g., product produced/time/weight protein) or an increase in percent conversion of the substrate to the product (e.g., percent conversion of a starting amount of substrate to product in a specified time period using a specified amount of uridine phosphorylase) as compared to the reference uridine phosphorylase. In some embodiments, the terms are used in reference to the improved uridine phosphorylases provided herein. Exemplary methods to determine the enzymatic activity of the engineered uridine phosphorylases of the invention are provided in the Examples. Any property relating to enzymatic activity may be affected, including the classical enzyme properties of Km, Vmax, or kcat, changes of which can lead to increased enzymatic activity. For example, the improvement in enzymatic activity can be from about 1.1-fold the enzymatic activity of the corresponding wild-type enzyme to as much as 2-fold, 5-fold, 10-fold, 20-fold, 25-fold, 50-fold, 75-fold, 100-fold, 150-fold, 200-fold, or more enzymatic activity than the naturally occurring uridine phosphorylase or another engineered uridine phosphorylase from which the uridine phosphorylase polypeptides were derived.

As used herein, "conversion" refers to the enzymatic conversion (or biotransformation) of one or more substrates to one or more corresponding products. "Percent conversion" refers to the percent of the substrate that is converted to the product within a period of time under specified conditions. Thus, the "enzymatic activity" or "activity" of a uridine phosphorylase polypeptide can be expressed as the "percent conversion" of the substrate to the product in a specific period of time.
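The quantities used above to express activity (percent conversion, specific activity, and fold improvement over a reference enzyme) reduce to simple ratios. The Python sketch below shows one way to compute them; the function names, the choice of units (micromoles, minutes, milligrams), and the assumption of one mole of product per mole of substrate are illustrative and are not taken from the Examples.

```python
def percent_conversion(substrate_initial_umol: float, product_umol: float) -> float:
    """Percent of the starting substrate converted to product.

    Assumes one mole of product is formed per mole of substrate consumed.
    """
    if substrate_initial_umol <= 0:
        raise ValueError("initial substrate amount must be positive")
    return 100.0 * product_umol / substrate_initial_umol


def specific_activity(product_umol: float, minutes: float, protein_mg: float) -> float:
    """Product formed per unit time per unit weight of protein."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("time and protein amount must be positive")
    return product_umol / (minutes * protein_mg)


def fold_improvement(variant_value: float, reference_value: float) -> float:
    """Ratio of a variant's activity measure to the reference enzyme's."""
    if reference_value <= 0:
        raise ValueError("reference value must be positive")
    return variant_value / reference_value
```

The comparison is only meaningful when the variant and the reference are measured under the same conditions (same enzyme amount, time period, and substrate loading), as the definitions above require.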

Enzymes with "generalist properties" (or "generalist enzymes") refer to enzymes that exhibit improved activity for a wide range of substrates, as compared to a parental sequence. Generalist enzymes do not necessarily demonstrate improved activity for every possible substrate. In some embodiments, the invention provides uridine phosphorylase variants with generalist properties, in that they demonstrate similar or improved activity relative to the parental gene for a wide range of sterically and electronically diverse substrates. In addition, the generalist enzymes provided herein were engineered to be improved across a wide range of diverse molecules to increase the production of metabolites/products.

The term "stringent hybridization conditions" is used herein to refer to conditions under which nucleic acid hybrids are stable. As known to those skilled in the art, the stability of hybrids is reflected in the melting temperature (Tm) of the hybrids. In general, the stability of a hybrid is a function of ionic strength, temperature, G/C content, and the presence of chaotropic agents. The Tm values for polynucleotides can be calculated using known methods for predicting melting temperatures (see, e.g., Baldino et al., Meth. Enzymol., 168:761-777 [1989]; Bolton et al., Proc. Natl. Acad. Sci. USA 48:1390 [1962]; Bresslauer et al., Proc. Natl. Acad. Sci. USA 83:8893-8897 [1986]; Freier et al., Proc. Natl. Acad. Sci. USA 83:9373-9377 [1986]; Kierzek et al., Biochem., 25:7840-7846 [1986]; Rychlik et al., Nucl. Acids Res., 18:6409-6412 [1990] (erratum, Nucl. Acids Res., 19:698 [1991]); Sambrook et al., supra; Suggs et al., 1981, in Developmental Biology Using Purified Genes, Brown et al. [eds.], pp. 683-693, Academic Press, Cambridge, MA [1981]; and Wetmur, Crit. Rev. Biochem. Mol. Biol. 26:227-259 [1991]). In some embodiments, the polynucleotide encodes a polypeptide disclosed herein and hybridizes under defined conditions, such as moderately stringent or highly stringent conditions, to the complement of a sequence encoding an engineered uridine phosphorylase of the present invention.
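As a rough illustration of how Tm depends on ionic strength and G/C content, the following Python sketch uses one widely quoted empirical approximation for long DNA duplexes, Tm = 81.5 + 16.6·log10[Na+] + 0.41·(%GC) − 675/N. The choice of this formula is an assumption made for the example; it is not a method prescribed herein, it ignores formamide and mismatches, and it is unreliable for short oligonucleotides.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def estimate_tm_celsius(seq: str, sodium_molar: float) -> float:
    """Empirical Tm estimate: 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 675/N."""
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    gc = gc_percent(seq)
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc - 675.0 / len(seq)


# Hypothetical 20-nucleotide sequence with 50% G/C content.
probe = "ATGC" * 5
tm_high_salt = estimate_tm_celsius(probe, sodium_molar=1.0)
tm_low_salt = estimate_tm_celsius(probe, sodium_molar=0.018)
print(round(tm_high_salt, 2), round(tm_low_salt, 2))
```

Lowering the salt concentration lowers the estimated Tm, which is consistent with low-salt washes being more stringent.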

As used herein, "hybridization stringency" relates to hybridization conditions, such as washing conditions, in the hybridization of nucleic acids. Generally, hybridization reactions are performed under conditions of lower stringency, followed by washes of varying but higher stringency. The term "moderately stringent hybridization" refers to conditions that permit target DNA to bind a complementary nucleic acid that has about 60% identity, preferably about 75% identity, or about 85% identity to the target DNA, with greater than about 90% identity to the target polynucleotide. Exemplary moderately stringent conditions are conditions equivalent to hybridization in 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42°C, followed by washing in 0.2× SSPE, 0.2% SDS, at 42°C. "High stringency hybridization" refers generally to conditions that are about 10°C or less from the thermal melting temperature Tm as determined under the solution condition for a defined polynucleotide sequence. In some embodiments, a high stringency condition refers to conditions that permit hybridization of only those nucleic acid sequences that form stable hybrids in 0.018 M NaCl at 65°C (i.e., if a hybrid is not stable in 0.018 M NaCl at 65°C, it will not be stable under high stringency conditions, as contemplated herein). High stringency conditions can be provided, for example, by hybridization in conditions equivalent to 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42°C, followed by washing in 0.1× SSPE and 0.1% SDS at 65°C. Another high stringency condition is hybridizing in conditions equivalent to hybridizing in 5× SSC containing 0.1% (w/v) SDS at 65°C and washing in 0.1× SSC containing 0.1% SDS at 65°C. Other high stringency hybridization conditions, as well as moderately stringent conditions, are described in the references cited above.

As used herein, "codon optimized" refers to changes in the codons of the polynucleotide encoding a protein to those preferentially used in a particular organism, such that the encoded protein is efficiently expressed in the organism of interest. Although the genetic code is degenerate, in that most amino acids are represented by several codons called "synonyms" or "synonymous" codons, it is well known that codon usage by particular organisms is nonrandom and biased towards particular codon triplets. This codon usage bias may be higher in reference to a given gene, genes of common function or ancestral origin, highly expressed proteins versus low copy number proteins, and the aggregate protein coding regions of an organism's genome. In some embodiments, the polynucleotides encoding the uridine phosphorylases may be codon optimized for optimal production in the host organism selected for expression.
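The simplest form of codon optimization, in which each amino acid is encoded by a single preferred codon, can be sketched as below. The preferred-codon table is an illustrative placeholder covering only a few residues and is not measured usage data for any host; practical codon optimization typically also weighs factors such as repeats and secondary structure, which this sketch ignores.

```python
# Hypothetical preferred-codon table (placeholder values, not real usage data).
# Each codon does encode the listed residue under the standard genetic code.
PREFERRED_CODONS = {
    "M": "ATG",
    "K": "AAA",
    "L": "CTG",
    "E": "GAA",
    "*": "TAA",  # stop codon
}


def codon_optimize(protein: str, preferred: dict[str, str]) -> str:
    """Back-translate a protein using one preferred codon per amino acid."""
    codons = []
    for residue in protein.upper():
        if residue not in preferred:
            raise ValueError(f"no preferred codon defined for residue {residue!r}")
        codons.append(preferred[residue])
    return "".join(codons)


# Hypothetical peptide, for illustration only.
optimized = codon_optimize("MKLE*", PREFERRED_CODONS)
print(optimized)
```

Because only synonymous codons are substituted, the encoded amino acid sequence is unchanged.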

As used herein, "preferred," "optimal," and "high codon usage bias" codons, when used alone or in combination, refer interchangeably to codons that are used at higher frequency in the protein coding regions than other codons that code for the same amino acid. The preferred codons may be determined in relation to codon usage in a single gene, a set of genes of common function or origin, highly expressed genes, the codon frequency in the aggregate protein coding regions of the whole organism, the codon frequency in the aggregate protein coding regions of related organisms, or combinations thereof. Codons whose frequency increases with the level of gene expression are typically optimal codons for expression. A variety of methods are known for determining the codon frequency (e.g., codon usage, relative synonymous codon usage) and codon preference in specific organisms, including multivariate analysis, for example using cluster analysis or correspondence analysis, and the effective number of codons used in a gene (see, e.g., GCG CodonPreference, Genetics Computer Group Wisconsin Package; CodonW, Peden, University of Nottingham; McInerney, Bioinform., 14:372-73 [1998]; Stenico et al., Nucl. Acids Res., 22:2437-46 [1994]; and Wright, Gene 87:23-29 [1990]). Codon usage tables are available for many different organisms (see, e.g., Wada et al., Nucl. Acids Res., 20:2111-2118 [1992]; Nakamura et al., Nucl. Acids Res., 28:292 [2000]; Duret et al., supra; Henaut and Danchin, in Escherichia coli and Salmonella, Neidhardt et al. (eds.), ASM Press, Washington D.C., pp. 2047-2066 [1996]). The data source for obtaining codon usage may rely on any available nucleotide sequence capable of coding for a protein. These data sets include nucleic acid sequences actually known to encode expressed proteins (e.g., complete protein coding sequences, CDS), expressed sequence tags (ESTs), or predicted coding regions of genomic sequences (see, e.g., Mount, Bioinformatics: Sequence and Genome Analysis, Chapter 8, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. [2001]; Uberbacher, Meth. Enzymol., 266:259-281 [1996]; and Tiwari et al., Comput. Appl. Biosci., 13:263-270 [1997]).
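Relative synonymous codon usage (RSCU), mentioned above as one measure of codon frequency, is commonly defined as the observed count of a codon divided by the mean count of all codons encoding the same amino acid. The following Python sketch computes it for a single coding sequence under the standard genetic code; the input sequence is a hypothetical toy example, and a real analysis would pool many coding sequences.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """RSCU per codon: observed count / mean count over its synonymous family."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds), 3))

    # Group codons into synonymous families by encoded amino acid.
    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values: dict[str, float] = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        mean = total / len(codons)
        for c in codons:
            values[c] = counts[c] / mean
    return values


# Hypothetical toy sequence: three AAA codons and one AAG codon (both lysine).
usage = rscu("AAAAAAAAAAAG")
print(usage["AAA"], usage["AAG"])
```

An RSCU above 1 indicates a codon used more often than expected under uniform synonymous usage; a value below 1 indicates the opposite.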

As used herein, "control sequence" includes all components that are necessary or advantageous for the expression of a polynucleotide and/or polypeptide of the present invention. Each control sequence may be native or foreign to the nucleic acid sequence encoding the polypeptide. Such control sequences include, but are not limited to, leader sequences, polyadenylation sequences, propeptide sequences, promoter sequences, signal peptide sequences, initiation sequences, and transcription terminators. At a minimum, the control sequences include a promoter and transcriptional and translational stop signals. The control sequences may be provided with linkers for the purpose of introducing specific restriction sites that facilitate ligation of the control sequences with the coding region of the nucleic acid sequence encoding a polypeptide.

"Operably linked" is defined herein as a configuration in which a control sequence is appropriately placed (i.e., in a functional relationship) at a position relative to a polynucleotide of interest such that the control sequence directs or regulates the expression of the polynucleotide and/or polypeptide of interest.

"Promoter sequence" refers to a nucleic acid sequence that is recognized by a host cell for expression of a polynucleotide of interest, such as a coding sequence. The promoter sequence contains transcriptional control sequences that mediate the expression of the polynucleotide of interest. The promoter may be any nucleic acid sequence that shows transcriptional activity in the host cell of choice, including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.

The phrase "suitable reaction conditions" refers to those conditions in the enzymatic conversion reaction solution (e.g., ranges of enzyme loading, substrate loading, temperature, pH, buffers, co-solvents, etc.) under which a uridine phosphorylase polypeptide of the present invention is capable of converting a substrate to the desired product compound. Some exemplary "suitable reaction conditions" are provided herein.

As used herein, "loading," such as in "compound loading" or "enzyme loading," refers to the concentration or amount of a component in a reaction mixture at the start of the reaction.

As used herein, "substrate," in the context of an enzymatic conversion reaction process, refers to the compound or molecule acted on by the engineered enzymes provided herein (e.g., engineered uridine phosphorylase polypeptides).

As used herein, an "increased" yield of a product (e.g., a deoxyribose phosphate analog) from a reaction occurs when a particular component present during the reaction (e.g., a uridine phosphorylase) causes more product to be produced, compared with a reaction conducted under the same conditions with the same substrate and other substituents, but in the absence of the component of interest.

A reaction is said to be "substantially free" of a particular enzyme if the amount of that enzyme, compared with other enzymes that participate in catalyzing the reaction, is less than about 2%, about 1%, or about 0.1% (wt/wt).

As used herein, "fractionating" a liquid (e.g., a culture broth) means applying a separation process (e.g., salt precipitation, column chromatography, size exclusion, and filtration) or a combination of such processes to provide a solution in which a desired protein comprises a greater percentage of total protein in the solution than in the initial liquid product.

As used herein, "starting composition" refers to any composition that comprises at least one substrate. In some embodiments, the starting composition comprises any suitable substrate.

As used herein, "product," in the context of an enzymatic conversion process, refers to the compound or molecule resulting from the action of an enzymatic polypeptide on a substrate.

As used herein, "equilibrium" refers to the process resulting in a steady-state concentration of chemical species in a chemical or enzymatic reaction (e.g., interconversion of two species A and B), including interconversion of stereoisomers, as determined by the forward rate constant and the reverse rate constant of the chemical or enzymatic reaction.
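For the simplest case of a reversible interconversion of two species, A ⇌ B, the relationship between the rate constants and the steady-state composition can be sketched as follows. The sketch assumes first-order kinetics in both directions, which is an assumption of the example only; the rate constants shown are hypothetical.

```python
def equilibrium_constant(k_forward: float, k_reverse: float) -> float:
    """K_eq = [B]/[A] at equilibrium for A <=> B with first-order rate constants."""
    if k_forward <= 0 or k_reverse <= 0:
        raise ValueError("rate constants must be positive")
    return k_forward / k_reverse


def equilibrium_fraction_b(k_forward: float, k_reverse: float) -> float:
    """Fraction of total material present as B once forward and reverse rates balance."""
    if k_forward <= 0 or k_reverse <= 0:
        raise ValueError("rate constants must be positive")
    return k_forward / (k_forward + k_reverse)


# Hypothetical rate constants (same units), for illustration only.
k_eq = equilibrium_constant(3.0, 1.0)
fraction_b = equilibrium_fraction_b(3.0, 1.0)
print(k_eq, fraction_b)
```

At equilibrium the forward rate k_forward·[A] equals the reverse rate k_reverse·[B], which gives both expressions above.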

As used herein, "alkyl" refers to saturated hydrocarbon groups of from 1 to 18 carbon atoms inclusively, either straight chained or branched, more preferably from 1 to 8 carbon atoms inclusively, and most preferably 1 to 6 carbon atoms inclusively. An alkyl with a specified number of carbon atoms is denoted in parentheses (e.g., (C1-C4)alkyl refers to an alkyl of 1 to 4 carbon atoms).

As used herein, "alkenyl" refers to groups of from 2 to 12 carbon atoms inclusively, either straight or branched, containing at least one double bond but optionally containing more than one double bond.

As used herein, "alkynyl" refers to groups of from 2 to 12 carbon atoms inclusively, either straight or branched, containing at least one triple bond but optionally containing more than one triple bond, and additionally optionally containing one or more double bonded moieties.

As used herein, "heteroalkyl," "heteroalkenyl," and "heteroalkynyl" refer to alkyl, alkenyl, and alkynyl as defined herein in which one or more of the carbon atoms are each independently replaced with the same or different heteroatoms or heteroatomic groups. Heteroatoms and/or heteroatomic groups which can replace the carbon atoms include, but are not limited to, -O-, -S-, -S-O-, -NRα-, -PH-, -S(O)-, -S(O)2-, -S(O)NRα-, -S(O)2NRα-, and the like, including combinations thereof, where each Rα is independently selected from hydrogen, alkyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, and heteroaryl.

As used herein, "alkoxy" refers to the group -ORβ, wherein Rβ is an alkyl group as defined above, including optionally substituted alkyl groups as also defined herein.

As used herein, "aryl" refers to an unsaturated aromatic carbocyclic group of from 6 to 12 carbon atoms inclusively having a single ring (e.g., phenyl) or more than one fused ring (e.g., naphthyl or anthryl). Exemplary aryls include phenyl, pyridyl, naphthyl, and the like.

As used herein, "amino" refers to the group -NH2. Substituted amino refers to the groups -NHRδ, NRδRδ, and NRδRδRδ, where each Rδ is independently selected from substituted or unsubstituted alkyl, cycloalkyl, cycloheteroalkyl, alkoxy, aryl, heteroaryl, heteroarylalkyl, acyl, alkoxycarbonyl, sulfanyl, sulfinyl, sulfonyl, and the like. Typical amino groups include, but are not limited to, dimethylamino, diethylamino, trimethylammonium, triethylammonium, methylsulfonylamino, furanyl-oxy-sulfamino, and the like.

As used herein, "oxo" refers to =O.

As used herein, "oxy" refers to a divalent group -O-, which may have various substituents to form different oxy groups, including ethers and esters.

As used herein, "carboxy" refers to -COOH.

As used herein, "carbonyl" refers to -C(O)-, which may have a variety of substituents to form different carbonyl groups including acids, acid halides, aldehydes, amides, esters, and ketones.

As used herein, "alkyloxycarbonyl" refers to -C(O)ORε, where Rε is an alkyl group as defined herein, which can be optionally substituted.

As used herein, "aminocarbonyl" refers to -C(O)NH2. Substituted aminocarbonyl refers to -C(O)NRδRδ, where the amino group NRδRδ is as defined herein.

As used herein, "halogen" and "halo" refer to fluoro, chloro, bromo, and iodo.

As used herein, "hydroxy" refers to -OH.

As used herein, "cyano" refers to -CN.

As used herein, "heteroaryl" refers to an aromatic heterocyclic group of from 1 to 10 carbon atoms inclusively and 1 to 4 heteroatoms inclusively selected from oxygen, nitrogen, and sulfur within the ring. Such heteroaryl groups can have a single ring (e.g., pyridyl or furyl) or more than one fused ring (e.g., indolizinyl or benzothienyl).

As used herein, "heteroarylalkyl" refers to an alkyl substituted with a heteroaryl (i.e., heteroaryl-alkyl- groups), preferably having from 1 to 6 carbon atoms inclusively in the alkyl moiety and from 5 to 12 ring atoms inclusively in the heteroaryl moiety. Such heteroarylalkyl groups are exemplified by pyridylmethyl and the like.

As used herein, "heteroarylalkenyl" refers to an alkenyl substituted with a heteroaryl (i.e., heteroaryl-alkenyl- groups), preferably having from 2 to 6 carbon atoms inclusively in the alkenyl moiety and from 5 to 12 ring atoms inclusively in the heteroaryl moiety.

As used herein, "heteroarylalkynyl" refers to an alkynyl substituted with a heteroaryl (i.e., heteroaryl-alkynyl- groups), preferably having from 2 to 6 carbon atoms inclusively in the alkynyl moiety and from 5 to 12 ring atoms inclusively in the heteroaryl moiety.

As used herein, "heterocycle," "heterocyclic," and interchangeably "heterocycloalkyl" refer to a saturated or unsaturated group having a single ring or more than one fused ring, from 2 to 10 carbon ring atoms inclusively, and from 1 to 4 hetero ring atoms inclusively selected from nitrogen, sulfur, or oxygen within the ring. Such heterocyclic groups can have a single ring (e.g., piperidinyl or tetrahydrofuryl) or more than one fused ring (e.g., indolinyl, dihydrobenzofuran, or quinuclidinyl). Examples of heterocycles include, but are not limited to, furan, thiophene, thiazole, oxazole, pyrrole, imidazole, pyrazole, pyridine, pyrazine, pyrimidine, pyridazine, indolizine, isoindole, indole, indazole, purine, quinolizine, isoquinoline, quinoline, phthalazine, naphthylpyridine, quinoxaline, quinazoline, cinnoline, pteridine, carbazole, carboline, phenanthridine, acridine, phenanthroline, isothiazole, phenazine, isoxazole, phenoxazine, phenothiazine, imidazolidine, imidazoline, piperidine, piperazine, pyrrolidine, indoline, and the like.

As used herein, "membered ring" is meant to embrace any cyclic structure. The number preceding the term "membered" denotes the number of skeletal atoms that constitute the ring. Thus, for example, cyclohexyl, pyridine, pyran, and thiopyran are 6-membered rings, and cyclopentyl, pyrrole, furan, and thiophene are 5-membered rings.

Unless otherwise specified, positions occupied by hydrogen in the foregoing groups can be further substituted with substituents exemplified by, but not limited to, hydroxy, oxo, nitro, methoxy, ethoxy, alkoxy, substituted alkoxy, trifluoromethoxy, haloalkoxy, fluoro, chloro, bromo, iodo, halo, methyl, ethyl, propyl, butyl, alkyl, alkenyl, alkynyl, substituted alkyl, trifluoromethyl, haloalkyl, hydroxyalkyl, alkoxyalkyl, thio, alkylthio, acyl, carboxy, alkoxycarbonyl, formylamino, substituted formylamino, alkylsulfonyl, alkylsulfinyl, alkylsulfonylamino, sulfonylamino, substituted sulfonylamino, cyano, amino, substituted amino, alkylamino, dialkylamino, aminoalkyl, acylamino, amidino, amidoximo, hydroxamoyl, phenyl, aryl, substituted aryl, aryloxy, arylalkyl, arylalkenyl, arylalkynyl, pyridyl, imidazolyl, heteroaryl, substituted heteroaryl, heteroaryloxy, heteroarylalkyl, heteroarylalkenyl, heteroarylalkynyl, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, cycloalkyl, cycloalkenyl, cycloalkylalkyl, substituted cycloalkyl, cycloalkyloxy, pyrrolidinyl, piperidinyl, morpholino, heterocycle, (heterocycle)oxy, and (heterocycle)alkyl; and preferred heteroatoms are oxygen, nitrogen, and sulfur. It is understood that where open valences exist on these substituents, they can be further substituted with alkyl, cycloalkyl, aryl, heteroaryl, and/or heterocycle groups; that where these open valences exist on carbon, they can be further substituted by halogen and by oxygen-, nitrogen-, or sulfur-bonded substituents; and that where more than one such open valence exists, these groups can be joined to form a ring, either by direct formation of a bond or by formation of bonds to a new heteroatom, preferably oxygen, nitrogen, or sulfur. It is further understood that the above substitutions can be made provided that replacing the hydrogen with the substituent does not introduce unacceptable instability to the molecules of the present invention, and is otherwise chemically reasonable.

As used herein, the term "culturing" refers to the growing of a population of microbial cells under any suitable conditions (e.g., using a liquid, gel, or solid medium).

Recombinant polypeptides can be produced using any suitable methods known in the art. Genes encoding the wild-type polypeptide of interest can be cloned in vectors, such as plasmids, and expressed in desired hosts, such as E. coli. Variants of recombinant polypeptides can be generated by various methods known in the art. Indeed, there is a wide variety of different mutagenesis techniques well known to those skilled in the art. In addition, mutagenesis kits are available from many commercial molecular biology suppliers. Methods are available to make specific substitutions at defined amino acids (site-directed), specific or random mutations in a localized region of the gene (region-specific), or random mutagenesis over the entire gene (e.g., saturation mutagenesis). Numerous suitable methods are known to those in the art to generate enzyme variants, including but not limited to site-directed mutagenesis of single-stranded DNA or double-stranded DNA using PCR, cassette mutagenesis, gene synthesis, error-prone PCR, shuffling, and chemical saturation mutagenesis, or any other suitable method known in the art. Mutagenesis and directed evolution methods can be readily applied to enzyme-encoding polynucleotides to generate variant libraries that can be expressed, screened, and assayed. Any suitable mutagenesis and directed evolution methods find use in the present invention and are well known in the art (see, e.g., U.S. Pat. Nos. 5,605,793, 5,811,238, 5,830,721, 5,834,252, 5,837,458, 5,928,905, 6,096,548, 6,117,679, 6,132,970, 6,165,793, 6,180,406, 6,251,674, 6,265,201, 6,277,638, 6,287,861, 6,287,862, 6,291,242, 6,297,053, 6,303,344, 6,309,883, 6,319,713, 6,319,714, 6,323,030, 6,326,204, 6,335,160, 6,335,198, 6,344,356, 6,352,859, 6,355,484, 6,358,740, 6,358,742, 6,365,377, 6,365,408, 6,368,861, 6,372,497, 6,337,186, 6,376,246, 6,379,964, 6,387,702, 6,391,552, 6,391,640, 6,395,547, 6,406,855, 6,406,910, 6,413,745, 6,413,774, 6,420,175, 6,423,542, 6,426,224, 6,436,675, 6,444,468, 6,455,253, 6,479,652, 6,482,647, 6,483,011, 6,484,105, 6,489,146, 6,500,617, 6,500,639, 6,506,602, 6,506,603, 6,518,065, 6,519,065, 6,521,453, 6,528,311, 6,537,746, 6,573,098, 6,576,467, 6,579,678, 6,586,182, 6,602,986, 6,605,430, 6,613,514, 6,653,072, 6,686,515, 6,703,240, 6,716,631, 6,825,001, 6,902,922, 6,917,882, 6,946,296, 6,961,664, 6,995,017, 7,024,312, 7,058,515, 7,105,297, 7,148,054, 7,220,566, 7,288,375, 7,384,387, 7,421,347, 7,430,477, 7,462,469, 7,534,564, 7,620,500, 7,620,502, 7,629,170, 7,702,464, 7,747,391, 7,747,393, 7,751,986, 7,776,598, 7,783,428, 7,795,030, 7,853,410, 7,868,138, 7,783,428, 7,873,477, 7,873,499, 7,904,249, 7,957,912, 7,981,614, 8,014,961, 8,029,988, 8,048,674, 8,058,001, 8,076,138, 8,108,150, 8,170,806, 8,224,580, 8,377,681, 8,383,346, 8,457,903, 8,504,498, 8,589,085, 8,762,066, 8,768,871, 9,593,326, 9,665,694, 9,684,771, and all related U.S., as well as PCT and non-U.S. counterparts; Ling et al., Anal. Biochem., 254(2):157-78 [1997]; Dale et al., Meth. Mol. Biol., 57:369-74 [1996]; Smith, Ann. Rev. Genet., 19:423-462 [1985]; Botstein et al., Science, 229:1193-1201 [1985]; Carter, Biochem. J., 237:1-7 [1986]; Kramer et al., Cell, 38:879-887 [1984]; Wells et al., Gene, 34:315-323 [1985]; Minshull et al., Curr. Op. Chem. Biol., 3:284-290 [1999]; Christians et al., Nat. Biotechnol., 17:259-264 [1999]; Crameri et al., Nature, 391:288-291 [1998]; Crameri et al., Nat. Biotechnol., 15:436-438 [1997]; Zhang et al., Proc. Nat. Acad. Sci. U.S.A., 94:4504-4509 [1997]; Crameri et al., Nat. Biotechnol., 14:315-319 [1996]; Stemmer, Nature, 370:389-391 [1994]; Stemmer, Proc. Nat. Acad. Sci. USA, 91:10747-10751 [1994]; WO 95/22625; WO 97/0078; WO 97/35966; WO 98/27230; WO 00/42651; WO 01/75767; and WO 2009/152336, all of which are incorporated herein by reference).
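To make the notion of a variant library concrete, the following Python sketch enumerates, in silico, the 19 single amino acid substitutions possible at one position of a protein, as would be targeted by saturation mutagenesis at that site. The function name, the mutation naming scheme, and the four-residue toy sequence are hypothetical and chosen only for the example; the sequence is not a sequence of the present invention, and the sketch does not model any laboratory mutagenesis protocol.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def site_saturation_variants(protein: str, position: int) -> dict[str, str]:
    """Return all single-substitution variants at a 1-indexed position.

    Keys name the mutation as <original><position><replacement>, e.g. "K2A".
    """
    if not 1 <= position <= len(protein):
        raise ValueError("position is outside the sequence")
    original = protein[position - 1]
    variants: dict[str, str] = {}
    for replacement in AMINO_ACIDS:
        if replacement == original:
            continue  # skip the unchanged residue
        name = f"{original}{position}{replacement}"
        variants[name] = protein[:position - 1] + replacement + protein[position:]
    return variants


# Hypothetical toy sequence, for illustration only.
library = site_saturation_variants("MKLV", 2)
print(len(library), library["K2A"])
```

Each entry could then be tracked through expression, screening, and assay by its mutation name.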

在一些实施方案中,诱变处理后获得的酶克隆通过使酶制品经受指定的温度(或其他测定条件)并测量热处理或其他合适的测定条件后剩余的酶活性的量进行筛选。然后从基因分离含有编码多肽的多核苷酸的克隆,将其测序以鉴定核苷酸序列变化(如果有),并且用于在宿主细胞中表达酶。测量来自表达文库的酶活性可以使用本领域已知的任何合适的方法(例如,标准生物化学技术,诸如HPLC分析)来进行。In some embodiments, the enzyme clones obtained after mutagenesis are screened by subjecting the enzyme preparation to a specified temperature (or other assay conditions) and measuring the amount of enzyme activity remaining after heat treatment or other suitable assay conditions. Clones containing polynucleotides encoding polypeptides are then isolated from the gene, sequenced to identify nucleotide sequence changes (if any), and used to express the enzyme in a host cell. Measuring enzyme activity from an expression library can be performed using any suitable method known in the art (e.g., standard biochemical techniques, such as HPLC analysis).

产生变体后,可以对它们筛选任何期望的性质(例如,高或增加的活性、或者低或减少的活性、增加的热活性、增加的热稳定性和/或酸性pH稳定性等)。在一些实施方案中,可使用“重组尿苷磷酸化酶多肽”(在本文也称为“工程化尿苷磷酸化酶多肽”、“变体尿苷磷酸化酶”、“尿苷磷酸化酶变体”和“尿苷磷酸化酶组合变体”)。Once variants are generated, they can be screened for any desired property (e.g., high or increased activity, or low or decreased activity, increased thermal activity, increased thermal stability and/or acidic pH stability, etc.). In some embodiments, "recombinant uridine phosphorylase polypeptides" (also referred to herein as "engineered uridine phosphorylase polypeptides," "variant uridine phosphorylases," "uridine phosphorylase variants," and "uridine phosphorylase combinatorial variants") can be used.

As used herein, a "vector" is a DNA construct for introducing a DNA sequence into a cell. In some embodiments, the vector is an expression vector that is operably linked to a suitable control sequence capable of effecting the expression, in a suitable host, of the polypeptide encoded in the DNA sequence. In some embodiments, an "expression vector" has a promoter sequence operably linked to the DNA sequence (e.g., a transgene) to drive expression in a host cell, and in some embodiments, also comprises a transcription terminator sequence.

如本文使用的,术语“表达”包括参与多肽产生的任何步骤,包括但不限于,转录、转录后修饰、翻译和翻译后修饰。在一些实施方案中,该术语还包括多肽从细胞的分泌。As used herein, the term "expression" includes any step involved in the production of a polypeptide, including, but not limited to, transcription, post-transcriptional modification, translation, and post-translational modification. In some embodiments, the term also includes secretion of the polypeptide from a cell.

如本文使用的,术语“产生”指蛋白和/或其他化合物由细胞的产生。意图是,该术语包括参与多肽产生的任何步骤,包括但不限于,转录、转录后修饰、翻译和翻译后修饰。在一些实施方案中,该术语还包括多肽从细胞的分泌。As used herein, the term "production" refers to the production of proteins and/or other compounds by cells. It is intended that the term includes any step involved in the production of a polypeptide, including, but not limited to, transcription, post-transcriptional modification, translation, and post-translational modification. In some embodiments, the term also includes secretion of a polypeptide from a cell.

如本文使用的,如果氨基酸或核苷酸序列(例如,启动子序列、信号肽、终止子序列等)与它被可操作地连接至其的另一个序列在自然界中未缔合,则这两个序列为异源的。例如“异源”多核苷酸是通过实验室技术被引入宿主细胞的任何多核苷酸,并且包括从宿主细胞中取出、进行实验室操作并且然后重新引入宿主细胞的多核苷酸。As used herein, an amino acid or nucleotide sequence (e.g., a promoter sequence, signal peptide, terminator sequence, etc.) is heterologous to another sequence to which it is operably linked if the two sequences are not associated in nature. For example, a "heterologous" polynucleotide is any polynucleotide that is introduced into a host cell by laboratory techniques, and includes polynucleotides that are removed from a host cell, subjected to laboratory manipulation, and then reintroduced into the host cell.

如本文使用的,术语“宿主细胞”和“宿主菌株”是指用于包含本文提供的DNA(例如,编码尿苷磷酸化酶变体的多核苷酸)的表达载体的合适的宿主。在一些实施方案中,宿主细胞是已经用使用如本领域已知的重组DNA技术构建的载体转化或转染的原核细胞或真核细胞。As used herein, the terms "host cell" and "host strain" refer to a suitable host for an expression vector comprising a DNA as provided herein (e.g., a polynucleotide encoding a uridine phosphorylase variant). In some embodiments, the host cell is a prokaryotic or eukaryotic cell that has been transformed or transfected with a vector constructed using recombinant DNA techniques as known in the art.

术语“类似物”意指与参考多肽具有多于70%序列同一性,但少于100%序列同一性(例如,多于75%、78%、80%、83%、85%、88%、90%、91%、92%、93%、94%、95%、96%、97%、98%、99%序列同一性)的多肽。在一些实施方案中,类似物意指如下多肽,所述多肽包含一个或更多个非天然存在的氨基酸残基(包括但不限于高精氨酸、鸟氨酸和正缬氨酸)以及天然存在的氨基酸。在一些实施方案中,类似物还包括一个或更多个D-氨基酸残基以及两个或更多个氨基酸残基之间的非肽连接。The term "analog" means a polypeptide having more than 70% sequence identity, but less than 100% sequence identity (e.g., more than 75%, 78%, 80%, 83%, 85%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% sequence identity) with a reference polypeptide. In some embodiments, an analog means a polypeptide comprising one or more non-naturally occurring amino acid residues (including but not limited to homoarginine, ornithine and norvaline) and naturally occurring amino acids. In some embodiments, an analog also includes one or more D-amino acid residues and non-peptide connections between two or more amino acid residues.

The term "effective amount" means an amount sufficient to produce the desired result. One of ordinary skill in the art can determine the effective amount by routine experimentation.

术语“分离的”和“纯化的”用于指从与其天然缔合的至少一种其他组分分开的分子(例如,分离的核酸、多肽等)或其他组分。术语“纯化的”不要求绝对纯度,而是意图作为相对定义。The terms "isolated" and "purified" are used to refer to molecules (e.g., isolated nucleic acids, polypeptides, etc.) or other components that are separated from at least one other component with which they are naturally associated. The term "purified" does not require absolute purity, but is intended as a relative definition.

As used herein, "stereoselectivity" refers to the preferential formation of one stereoisomer over another in a chemical or enzymatic reaction. Stereoselectivity can be partial, where the formation of one stereoisomer is favored over the other, or it can be complete, where only one stereoisomer is formed. When the stereoisomers are enantiomers, the stereoselectivity is referred to as enantioselectivity, that is, the fraction (typically reported as a percentage) of one enantiomer in the sum of both. It is commonly alternatively reported in the art (typically as a percentage) as the enantiomeric excess ("e.e.") calculated therefrom according to the formula [major enantiomer - minor enantiomer]/[major enantiomer + minor enantiomer]. When the stereoisomers are diastereomers, the stereoselectivity is referred to as diastereoselectivity, that is, the fraction (typically reported as a percentage) of one diastereomer in a mixture of two diastereomers, commonly alternatively reported as the diastereomeric excess ("d.e."). Enantiomeric excess and diastereomeric excess are types of stereoisomeric excess.
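The relationship between the enantiomer fraction and the enantiomeric excess defined above can be illustrated with a short calculation. This is a minimal sketch only; the function names and the example amounts are hypothetical and are not taken from the disclosure.

```python
def enantiomeric_excess(major: float, minor: float) -> float:
    """Return e.e. as a percentage: 100 * (major - minor) / (major + minor)."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount of both enantiomers must be positive")
    return 100.0 * (major - minor) / total


def enantiomer_fraction(major: float, minor: float) -> float:
    """Return the major enantiomer as a percentage of the sum of both."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount of both enantiomers must be positive")
    return 100.0 * major / total


# Hypothetical amounts in any consistent unit (for example, HPLC peak areas).
print(enantiomeric_excess(95.0, 5.0))   # e.e. in percent
print(enantiomer_fraction(95.0, 5.0))   # fraction of the major enantiomer in percent
```

The same arithmetic applies to a diastereomeric excess when the two inputs are the amounts of the two diastereomers.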

如本文使用的,“区域选择性”和“区域选择性反应”是指其中一个键形成或断裂方向优先于所有其他可能方向发生的反应。如果区分是完全的,则反应可以是完全(100%)区域选择性的;如果一个位点的反应产物相比其他位点的反应产物占主导地位,则是基本上区域选择性的(至少75%),或者部分区域选择性的(x%,其中百分比取决于感兴趣的反应设置)。As used herein, "regioselectivity" and "regioselective reaction" refer to a reaction in which one direction of bond formation or cleavage occurs in preference to all other possible directions. A reaction can be completely (100%) regioselective if the discrimination is complete, substantially regioselective (at least 75%) if the reaction product of one site predominates over the reaction products of other sites, or partially regioselective (x%, where the percentage depends on the reaction setting of interest).

如本文使用的,“化学选择性”是指在化学或酶促反应中一种产物相比另一种产物优先形成。As used herein, "chemoselectivity" refers to the preferential formation of one product over another in a chemical or enzymatic reaction.

如本文使用的,“pH稳定的”是指与未处理的酶相比,在暴露于高或低的pH(例如4.5-6或8至12)一段时间(例如0.5-24小时)后维持类似活性(例如多于60%至80%)的尿苷磷酸化酶多肽。As used herein, "pH stable" refers to a uridine phosphorylase polypeptide that maintains similar activity (e.g., more than 60% to 80%) after exposure to high or low pH (e.g., 4.5-6 or 8 to 12) for a period of time (e.g., 0.5-24 hours) compared to the untreated enzyme.

如本文使用的,“热稳定”指与暴露于相同的升高的温度的野生型酶相比,在暴露于升高的温度(例如40-80℃)一定时间段(例如0.5-24h)后,保持相似活性(例如多于60%至80%)的尿苷磷酸化酶多肽。As used herein, "thermostable" refers to a uridine phosphorylase polypeptide that retains similar activity (e.g., more than 60% to 80%) after exposure to an elevated temperature (e.g., 40-80°C) for a certain period of time (e.g., 0.5-24 h) compared to a wild-type enzyme exposed to the same elevated temperature.

如本文使用的,“溶剂稳定”指与暴露于相同浓度的相同溶剂的野生型酶相比,在暴露于不同浓度(例如5%-99%)的溶剂(乙醇、异丙醇、二甲基亚砜[DMSO]、四氢呋喃、2-甲基四氢呋喃、丙酮、甲苯、乙酸丁酯、甲基叔丁基醚等)一定时间段(例如0.5h至24h)后,保持相似活性(多于例如60%至80%)的尿苷磷酸化酶多肽。As used herein, "solvent stable" refers to a uridine phosphorylase polypeptide that retains similar activity (e.g., more than 60% to 80%) after exposure to varying concentrations (e.g., 5%-99%) of a solvent (ethanol, isopropanol, dimethyl sulfoxide [DMSO], tetrahydrofuran, 2-methyltetrahydrofuran, acetone, toluene, butyl acetate, methyl tert-butyl ether, etc.) for a certain period of time (e.g., 0.5 h to 24 h) compared to a wild-type enzyme exposed to the same solvent at the same concentration.

如本文使用的,“热稳定且溶剂稳定”是指既热稳定又溶剂稳定的尿苷磷酸化酶多肽。As used herein, "thermostable and solvent stable" refers to a uridine phosphorylase polypeptide that is both thermostable and solvent stable.
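The stability definitions above all compare the activity remaining after a challenge (pH, temperature, or solvent) with a reference activity. The following is a minimal sketch of that comparison; the function names, the default 60% threshold, and the example values are assumptions chosen for illustration, and the appropriate reference (untreated enzyme or wild-type enzyme under the same challenge) depends on which definition is being applied.

```python
def residual_activity_percent(challenged_activity: float, reference_activity: float) -> float:
    """Return activity after the challenge as a percentage of the reference activity."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * challenged_activity / reference_activity


def retains_similar_activity(residual_percent: float, threshold_percent: float = 60.0) -> bool:
    """Return True if residual activity exceeds the chosen threshold (assumed 60% by default)."""
    return residual_percent > threshold_percent


# Hypothetical activities in arbitrary but consistent units.
residual = residual_activity_percent(challenged_activity=7.5, reference_activity=10.0)
print(residual, retains_similar_activity(residual))
```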

如本文使用的,“任选的”和“任选地”意指随后描述的事件或情形可以发生或可以不发生,并且意指该描述包括当该事件或情形发生的情况和其中该事件或情形不发生的情况。本领域普通技术人员将理解,对于被描述为含有一种或更多种任选的取代基的任何分子,仅意在包括空间上可实现的和/或合成上可行的化合物。As used herein, "optional" and "optionally" mean that the subsequently described event or circumstance may or may not occur, and that the description includes instances when the event or circumstance occurs and instances in which the event or circumstance does not occur. One of ordinary skill in the art will understand that for any molecule described as containing one or more optional substituents, only sterically feasible and/or synthetically feasible compounds are intended to be included.

如本文使用的,“任选地被取代的”是指术语或一系列化学基团中的所有后续修饰对象(modifier)。例如,在术语“任选地被取代的芳基烷基”中,分子的“烷基”部分和“芳基”部分可以被取代或可以不被取代,并且对于一系列“任选地被取代的烷基、环烷基、芳基和杂芳基”,烷基、环烷基、芳基和杂芳基基团彼此独立地可以被取代或可以不被取代。As used herein, "optionally substituted" refers to all subsequent modifiers in a term or a series of chemical groups. For example, in the term "optionally substituted arylalkyl", the "alkyl" portion and the "aryl" portion of the molecule may or may not be substituted, and for a series of "optionally substituted alkyl, cycloalkyl, aryl and heteroaryl", the alkyl, cycloalkyl, aryl and heteroaryl groups may or may not be substituted independently of each other.

发明详述DETAILED DESCRIPTION OF THE INVENTION

本发明提供了工程化尿苷磷酸化酶(UP)、具有UP活性的多肽,和编码这些酶的多核苷酸,以及载体和包含这些多核苷酸和多肽的宿主细胞。还提供了用于产生UP酶的方法。本发明还提供了包含UP酶的组合物,以及使用工程化UP酶的方法。本发明尤其可用于药物化合物的产生。The present invention provides engineered uridine phosphorylase (UP), polypeptides having UP activity, and polynucleotides encoding these enzymes, as well as vectors and host cells comprising these polynucleotides and polypeptides. Also provided are methods for producing UP enzymes. The present invention also provides compositions comprising UP enzymes, and methods for using engineered UP enzymes. The present invention is particularly useful for the production of pharmaceutical compounds.

越来越多的非天然核苷类似物被研究用于治疗癌症和病毒感染,诸如COVID-19。工业工艺条件通常要求使用容易可得且成本有效的中间体和底物高效产生非天然核苷。一种这样的中间体是化合物(1)即5’-异丁酰基尿苷。An increasing number of unnatural nucleoside analogs are being investigated for the treatment of cancer and viral infections, such as COVID-19. Industrial process conditions generally require the efficient production of unnatural nucleosides using readily available and cost-effective intermediates and substrates. One such intermediate is compound (1), 5'-isobutyryl uridine.

本领域对于在工业工艺条件下合成化合物(1)的方法,更特别是绿色化学方法存在需求。一种这样的方法是使用生物催化剂或工程化酶来产生化合物(1)。在一些实施方案中,本公开内容提供了用于产生化合物(1)的工程化尿苷磷酸化酶。本公开内容的工程化尿苷磷酸化酶从5’-异丁酰基核糖-1-磷酸底物化合物(2)和尿嘧啶化合物(3)产生化合物(1)。参见下文方案I。There is a need in the art for methods of synthesizing compound (1) under industrial process conditions, more particularly green chemistry methods. One such method is to use a biocatalyst or an engineered enzyme to produce compound (1). In some embodiments, the present disclosure provides an engineered uridine phosphorylase for producing compound (1). The engineered uridine phosphorylase of the present disclosure produces compound (1) from a 5'-isobutyryl ribose-1-phosphate substrate compound (2) and a uracil compound (3). See Scheme 1 below.

在化合物(2)和化合物(3)转化为化合物(1)中还产生无机磷酸盐(未图示)。添加蔗糖磷酸化酶(或许多消耗无机磷酸盐的酶中的任何一种,例如丙酮酸氧化酶)用于驱动可逆反应的平衡朝向化合物(1)产物。Inorganic phosphate is also produced in the conversion of compounds (2) and (3) to compound (1) (not shown). Addition of sucrose phosphorylase (or any of a number of enzymes that consume inorganic phosphate, such as pyruvate oxidase) is used to drive the equilibrium of the reversible reaction toward the product compound (1).
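The effect of removing inorganic phosphate can be sketched with a mass-action expression for the reversible reaction. This is an illustrative sketch and not an expression given in the disclosure; the symbol $K_{eq}$ and the use of concentrations in place of activities are assumptions.

```latex
\[
\text{(2)} + \text{(3)} \rightleftharpoons \text{(1)} + \mathrm{P_i},
\qquad
K_{eq} = \frac{[\text{(1)}]\,[\mathrm{P_i}]}{[\text{(2)}]\,[\text{(3)}]}
\quad\Longrightarrow\quad
\frac{[\text{(1)}]}{[\text{(2)}]\,[\text{(3)}]} = \frac{K_{eq}}{[\mathrm{P_i}]}
\]
```

At a fixed $K_{eq}$, lowering $[\mathrm{P_i}]$, for example with a phosphate-consuming enzyme, raises the equilibrium ratio of compound (1) to its substrates.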

与天然存在的尿苷磷酸化酶相比具有改进的性质的工程化尿苷磷酸化酶可以在相关工艺条件下和/或多酶系统中使用。这些工程化UP酶可以导致化合物(1)的产生提高和/或可以具有其他改进的性质。Engineered uridine phosphorylases with improved properties compared to naturally occurring uridine phosphorylases can be used under relevant process conditions and/or in multi-enzyme systems. These engineered UP enzymes can result in increased production of compound (1) and/or can have other improved properties.

对于具有改进的活性并且在典型的工业条件下操作和/或作为多酶系统的一部分的工程化UP存在需求。本发明解决了这一需求,并提供了适合在工业条件下用于这些反应和其他反应的工程化UP。There is a need for engineered UPs that have improved activity and operate under typical industrial conditions and/or as part of a multi-enzyme system. The present invention addresses this need and provides engineered UPs suitable for use in these and other reactions under industrial conditions.

Engineered UP Polypeptides

The present invention provides engineered UP polypeptides, polynucleotides encoding the polypeptides, methods of preparing the polypeptides, and methods of using the polypeptides. Where the description relates to polypeptides, it is to be understood that it also describes the polynucleotides encoding the polypeptides. In some embodiments, the present invention provides engineered, non-naturally occurring UP enzymes with improved properties as compared to wild-type UP enzymes. Any suitable reaction conditions can be used in the present invention. In some embodiments, methods are used to analyze the improved properties of the engineered polypeptides in carrying out the phosphorylation reaction. In some embodiments, as further described below and in the Examples, the reaction conditions are varied with respect to the concentrations or amounts of engineered UP, substrate(s), buffer(s), and solvent(s), the pH, conditions including temperature and reaction time, and/or conditions in which the engineered UP polypeptide is immobilized on a solid support.

在一些实施方案中,利用另外的反应组分或另外的技术来补充反应条件。在一些实施方案中,这些包括采取措施来稳定酶或防止酶失活、减少产物抑制、使反应平衡向期望的产物形成移动。In some embodiments, additional reaction components or additional techniques are used to supplement reaction conditions. In some embodiments, these include taking measures to stabilize the enzyme or prevent enzyme inactivation, reduce product inhibition, and shift the reaction equilibrium toward the desired product formation.

在一些另外的实施方案中,用于将底物化合物转化成产物化合物的任何上文描述的方法还可以包括一个或更多个选自以下的步骤:产物化合物的提取、分离、纯化、结晶、过滤和/或冻干。用于从通过本文提供的方法产生的生物催化反应混合物提取、分离、纯化和/或结晶产物的方法、技术和方案是普通技术人员已知的和/或通过常规实验可获得。此外,在下文的实施例中提供了说明性方法。In some other embodiments, any of the above-described methods for converting a substrate compound into a product compound may also include one or more steps selected from the following: extraction, separation, purification, crystallization, filtration and/or lyophilization of the product compound. Methods, techniques and protocols for extracting, separating, purifying and/or crystallizing products from a biocatalytic reaction mixture produced by the methods provided herein are known to those of ordinary skill and/or are obtainable by routine experimentation. In addition, illustrative methods are provided in the examples hereinafter.

编码工程化多肽的工程化UP多核苷酸、表达载体和宿主细胞Engineered UP polynucleotides encoding engineered polypeptides, expression vectors and host cells

本发明提供了编码本文描述的工程化酶多肽的多核苷酸。在一些实施方案中,多核苷酸被可操作地连接至控制基因表达的一个或更多个异源调节序列,以创建能够表达多肽的重组多核苷酸。在一些实施方案中,含有至少一种编码工程化酶多肽的异源多核苷酸的表达构建体被引入适当的宿主细胞以表达对应的酶多肽。The present invention provides polynucleotides encoding engineered enzyme polypeptides described herein. In some embodiments, the polynucleotides are operably linked to one or more heterologous regulatory sequences that control gene expression to create a recombinant polynucleotide capable of expressing a polypeptide. In some embodiments, an expression construct containing at least one heterologous polynucleotide encoding an engineered enzyme polypeptide is introduced into an appropriate host cell to express the corresponding enzyme polypeptide.

As will be apparent to those skilled in the art, the availability of a protein sequence and the knowledge of the codons corresponding to the various amino acids provide a description of all the polynucleotides capable of encoding the subject polypeptides. The degeneracy of the genetic code, in which the same amino acids are encoded by alternative or synonymous codons, allows an extremely large number of nucleic acids to be made, all of which encode an engineered enzyme (e.g., UP) polypeptide. Thus, the present invention provides methods and compositions for producing each and every possible variation of enzyme polynucleotides that could be made that encode the enzyme polypeptides described herein by selecting combinations based on the possible codon choices, and all such variations are to be considered specifically disclosed for any polypeptide described herein, including the amino acid sequences presented in the Examples (e.g., in the various Tables).

在一些实施方案中,密码子优选地被优化以供所选择的宿主细胞用于蛋白产生。例如,细菌中使用的优选的密码子通常用于在细菌中的表达。因此,编码工程化酶多肽的密码子优化的多核苷酸在全长编码区中约40%、50%、60%、70%、80%、90%或大于90%的密码子位置处含有优选的密码子。In some embodiments, codons are preferably optimized for protein production by the selected host cell. For example, the preferred codons used in bacteria are generally used for expression in bacteria. Therefore, the codon-optimized polynucleotides encoding engineered enzyme polypeptides contain preferred codons at about 40%, 50%, 60%, 70%, 80%, 90% or greater than 90% of the codon positions in the full-length coding region.
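The percentage of codon positions that carry a preferred codon, as described above, can be computed directly from a coding sequence. The following is a minimal sketch under stated assumptions: the function name, the example sequence, and the preferred-codon set are hypothetical, and a real preferred-codon set would be derived from a codon usage table for the chosen host.

```python
def preferred_codon_percent(coding_sequence: str, preferred_codons: set) -> float:
    """Return the percentage of codon positions occupied by a preferred codon."""
    seq = coding_sequence.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    # Split the coding region into consecutive, in-frame codons.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    hits = sum(1 for codon in codons if codon in preferred_codons)
    return 100.0 * hits / len(codons)


# Hypothetical preferred-codon set, for illustration only.
HYPOTHETICAL_PREFERRED = {"ATG", "GCG", "AAA", "CTG", "GAA"}

# Hypothetical coding region with codons ATG GCG AAA CTG GAG GCC.
example = "ATGGCGAAACTGGAGGCC"
print(preferred_codon_percent(example, HYPOTHETICAL_PREFERRED))
```

In this example four of the six codon positions carry a codon from the hypothetical preferred set.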

在一些实施方案中,酶多核苷酸编码具有酶活性与本文公开的性质的工程化多肽,其中所述多肽包含与选自本文提供的SEQ ID NO的参考序列或任何变体(例如实施例中提供的那些)的氨基酸序列具有至少60%、65%、70%、75%、80%、85%、86%、87%、88%、89%、90%、91%、92%、93%、94%、95%、96%、97%、98%、99%或更多的同一性的氨基酸序列,和与一种或更多种参考多核苷酸或如实施例中公开的任何变体的氨基酸序列相比的一个或更多个残基差异(例如1个、2个、3个、4个、5个、6个、7个、8个、9个、10个或更多个氨基酸残基位置)。在一些实施方案中,参考多肽序列选自SEQ ID NO:2、SEQ ID NO:4、SEQ IDNO:246、SEQ ID NO:594、SEQ ID NO:776和/或SEQ ID NO:868。In some embodiments, the enzyme polynucleotide encodes an engineered polypeptide having enzyme activity and properties disclosed herein, wherein the polypeptide comprises an amino acid sequence having at least 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identity to an amino acid sequence selected from a reference sequence of a SEQ ID NO provided herein, or any variant (e.g., those provided in the Examples), and one or more residue differences (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more amino acid residue positions) compared to the amino acid sequence of one or more reference polynucleotides or any variant as disclosed in the Examples. In some embodiments, the reference polypeptide sequence is selected from SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:246, SEQ ID NO:594, SEQ ID NO:776 and/or SEQ ID NO:868.
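Percent identity and residue differences relative to a reference sequence can be illustrated with a simplified calculation. This sketch assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it divides the number of identical positions by the alignment length; the function names and the short example peptides are hypothetical and are not sequences from the disclosure, and the disclosure's own definition of percent identity governs where it differs.

```python
def percent_identity(aligned_ref: str, aligned_var: str) -> float:
    """Return identical, non-gap columns as a percentage of the alignment length."""
    if not aligned_ref or len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(1 for r, v in zip(aligned_ref, aligned_var) if r == v and r != "-")
    return 100.0 * matches / len(aligned_ref)


def residue_differences(aligned_ref: str, aligned_var: str) -> list:
    """Return (alignment column, reference residue, variant residue) for each mismatch.

    Columns are 1-based positions in the alignment, which equal reference
    positions only when the aligned reference contains no gaps.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must be of equal length")
    return [
        (i + 1, r, v)
        for i, (r, v) in enumerate(zip(aligned_ref, aligned_var))
        if r != v
    ]


# Hypothetical ten-residue peptides differing at one position.
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
print(percent_identity(reference, variant))
print(residue_differences(reference, variant))
```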

In some embodiments, the polynucleotides are capable of hybridizing under highly stringent conditions to a reference polynucleotide sequence selected from any polynucleotide sequence provided herein, or a complement thereof, or to a polynucleotide sequence encoding any of the variant enzyme polypeptides provided herein. In some embodiments, the polynucleotide capable of hybridizing under highly stringent conditions encodes an enzyme polypeptide comprising an amino acid sequence that has one or more residue differences as compared to a reference sequence.

In some embodiments, an isolated polynucleotide encoding any of the engineered enzyme polypeptides herein is manipulated in a variety of ways to facilitate expression of the enzyme polypeptide. In some embodiments, the polynucleotides encoding the enzyme polypeptides comprise expression vectors in which one or more control sequences are present to regulate the expression of the enzyme polynucleotides and/or polypeptides. Manipulation of the isolated polynucleotide prior to its insertion into a vector may be desirable or necessary, depending on the expression vector used. Techniques for modifying polynucleotides and nucleic acid sequences using recombinant DNA methods are well known in the art. In some embodiments, the control sequences include, among others, promoters, leader sequences, polyadenylation sequences, propeptide sequences, signal peptide sequences, and transcription terminators. In some embodiments, suitable promoters are selected based on the choice of host cell. For bacterial host cells, suitable promoters for directing transcription of the nucleic acid constructs of the present disclosure include, but are not limited to, promoters obtained from the Escherichia coli lac operon, the Streptomyces coelicolor agarase gene (dagA), the Bacillus subtilis levansucrase gene (sacB), the Bacillus licheniformis α-amylase gene (amyL), the Bacillus stearothermophilus maltogenic amylase gene (amyM), the Bacillus amyloliquefaciens α-amylase gene (amyQ), the Bacillus licheniformis penicillinase gene (penP), the Bacillus subtilis xylA and xylB genes, and prokaryotic β-lactamase genes (see, e.g., Villa-Kamaroff et al., Proc. Natl Acad. Sci. USA 75:3727-3731 [1978]), as well as the tac promoter (see, e.g., DeBoer et al., Proc. Natl Acad. Sci. USA 80:21-25 [1983]). Exemplary promoters for filamentous fungal host cells include, but are not limited to, promoters obtained from the genes for Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral α-amylase, Aspergillus niger acid-stable α-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, and Fusarium oxysporum trypsin-like protease (see, e.g., WO 96/00787), as well as the NA2-tpi promoter (a hybrid of the promoters from the genes for Aspergillus niger neutral α-amylase and Aspergillus oryzae triose phosphate isomerase), and mutant, truncated, and hybrid promoters thereof. Exemplary yeast cell promoters can be obtained from the genes for Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae galactokinase (GAL1), Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP), and Saccharomyces cerevisiae 3-phosphoglycerate kinase. Other useful promoters for yeast host cells are known in the art (see, e.g., Romanos et al., Yeast 8:423-488 [1992]).

In some embodiments, the control sequence is also a suitable transcription terminator sequence (i.e., a sequence recognized by a host cell to terminate transcription). In some embodiments, the terminator sequence is operably linked to the 3' terminus of the nucleic acid sequence encoding the enzyme polypeptide. Any suitable terminator that is functional in the host cell of choice can be used in the present invention. Exemplary transcription terminators for filamentous fungal host cells can be obtained from the genes for Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Aspergillus niger α-glucosidase, and Fusarium oxysporum trypsin-like protease. Exemplary terminators for yeast host cells can be obtained from the genes for Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), and Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful terminators for yeast host cells are known in the art (see, e.g., Romanos et al., supra).

在一些实施方案中,控制序列也是合适的前导序列(即对由宿主细胞的翻译重要的mRNA的非翻译区)。在一些实施方案中,前导序列可操作地连接至编码酶多肽的核酸序列的5’末端。在选择的宿主细胞中有功能的任何合适的前导序列可用于本发明中。用于丝状真菌宿主细胞的示例性前导序列从以下的基因获得:米曲霉TAKA淀粉酶和构巢曲霉磷酸丙糖异构酶。用于酵母宿主细胞的合适的前导序列从以下的基因获得:酿酒酵母烯醇化酶(ENO-1)、酿酒酵母3-磷酸甘油酸激酶、酿酒酵母α-因子和酿酒酵母醇脱氢酶/甘油醛-3-磷酸脱氢酶(ADH2/GAP)。In some embodiments, the control sequence is also a suitable leader sequence (i.e., a non-translated region of an mRNA that is important for translation by the host cell). In some embodiments, the leader sequence is operably linked to the 5' end of the nucleic acid sequence encoding the enzyme polypeptide. Any suitable leader sequence that is functional in the selected host cell can be used in the present invention. Exemplary leaders for filamentous fungal host cells are obtained from the following genes: Aspergillus oryzae TAKA amylase and Aspergillus nidulans triosephosphate isomerase. Suitable leaders for yeast host cells are obtained from the following genes: Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae 3-phosphoglycerate kinase, Saccharomyces cerevisiae α-factor, and Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP).

在一些实施方案中,控制序列也是多腺苷酸化序列(即可操作地连接至核酸序列的3’末端的序列,并且其在转录时,被宿主细胞识别为将多腺苷残基添加至转录的mRNA的信号)。在选择的宿主细胞中有功能的任何合适的多腺苷酸化序列可用于本发明中。用于丝状真菌宿主细胞的示例性多腺苷酸化序列包括但不限于以下的基因:米曲霉TAKA淀粉酶、黑曲霉葡糖淀粉酶、构巢曲霉邻氨基苯甲酸合酶、尖孢镰刀菌胰蛋白酶样蛋白酶和黑曲霉α-葡糖苷酶。用于酵母宿主细胞的有用的多腺苷酸化序列是已知的(参见例如Guo和Sherman,Mol.Cell.Bio.,15:5983-5990[1995])。In some embodiments, the control sequence is also a polyadenylation sequence (i.e., a sequence that is operably linked to the 3' end of the nucleic acid sequence and, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to the transcribed mRNA). Any suitable polyadenylation sequence that is functional in the host cell of choice can be used in the present invention. Exemplary polyadenylation sequences for filamentous fungal host cells include, but are not limited to, the following genes: Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Fusarium oxysporum trypsin-like protease, and Aspergillus niger alpha-glucosidase. Useful polyadenylation sequences for yeast host cells are known (see, e.g., Guo and Sherman, Mol. Cell. Bio., 15:5983-5990 [1995]).

In some embodiments, the control sequence is also a signal peptide (i.e., a coding region that codes for an amino acid sequence linked to the amino terminus of a polypeptide and directs the encoded polypeptide into the cell's secretory pathway). In some embodiments, the 5' end of the coding sequence of the nucleic acid sequence inherently contains a signal peptide coding region naturally linked in translation reading frame with the segment of the coding region that encodes the secreted polypeptide. Alternatively, in some embodiments, the 5' end of the coding sequence contains a signal peptide coding region that is foreign to the coding sequence. Any suitable signal peptide coding region that directs the expressed polypeptide into the secretory pathway of the host cell of choice can be used for the expression of the engineered polypeptide(s). Effective signal peptide coding regions for bacterial host cells include, but are not limited to, those obtained from the genes for Bacillus NCIB 11837 maltogenic amylase, Bacillus stearothermophilus α-amylase, Bacillus licheniformis subtilisin, Bacillus licheniformis β-lactamase, Bacillus stearothermophilus neutral proteases (nprT, nprS, nprM), and Bacillus subtilis prsA. Further signal peptides are known in the art (see, e.g., Simonen and Palva, Microbiol. Rev., 57:109-137 [1993]). In some embodiments, effective signal peptide coding regions for filamentous fungal host cells include, but are not limited to, those obtained from the genes for Aspergillus oryzae TAKA amylase, Aspergillus niger neutral amylase, Aspergillus niger glucoamylase, Rhizomucor miehei aspartic proteinase, Humicola insolens cellulase, and Humicola lanuginosa lipase. Useful signal peptides for yeast host cells include, but are not limited to, those from the genes for Saccharomyces cerevisiae α-factor and Saccharomyces cerevisiae invertase.

在一些实施方案中,控制序列也是编码定位在多肽的氨基末端处的氨基酸序列的前肽编码区。产生的多肽被称为“前酶(proenzyme)”、“前多肽(propolypeptide)”或“酶原(zymogen)”。前多肽可以通过催化或自动催化前肽从前多肽的裂解被转化为成熟活性多肽。前肽编码区可以从包括但不限于以下的基因的任何合适的来源获得:枯草芽孢杆菌碱性蛋白酶(aprE)、枯草芽孢杆菌中性蛋白酶(nprT)、酿酒酵母α-因子、米黑根毛霉天冬氨酸蛋白酶和嗜热毁丝霉(Myceliophthora thermophila)乳糖酶(参见例如WO 95/33836)。在信号肽和前肽区域两者均存在于多肽的氨基末端时,前肽区域紧邻多肽的氨基末端定位并且信号肽区域紧邻前肽区域的氨基末端定位。In some embodiments, the control sequence is also a propeptide coding region encoding an amino acid sequence positioned at the amino terminus of a polypeptide. The polypeptide produced is referred to as a "proenzyme", "propolypeptide" or "zymogen". The propolypeptide can be converted into a mature active polypeptide by catalyzing or autocatalyzing the cleavage of the propeptide from the propolypeptide. The propeptide coding region can be obtained from any suitable source including but not limited to the following genes: Bacillus subtilis alkaline protease (aprE), Bacillus subtilis neutral protease (nprT), Saccharomyces cerevisiae α-factor, Rhizomucor miehei aspartic protease and Myceliophthora thermophila lactase (see, e.g., WO 95/33836). When both the signal peptide and propeptide regions are present at the amino terminus of a polypeptide, the propeptide region is positioned adjacent to the amino terminus of the polypeptide and the signal peptide region is positioned adjacent to the amino terminus of the propeptide region.

In some embodiments, regulatory sequences are also utilized. These sequences facilitate the regulation of polypeptide expression relative to the growth of the host cell. Examples of regulatory systems are those that cause the expression of a gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. In prokaryotic host cells, suitable regulatory sequences include, but are not limited to, the lac, tac, and trp operator systems. In yeast host cells, suitable regulatory systems include, but are not limited to, the ADH2 system or the GAL1 system. In filamentous fungi, suitable regulatory sequences include, but are not limited to, the TAKA α-amylase promoter, the Aspergillus niger glucoamylase promoter, and the Aspergillus oryzae glucoamylase promoter.

In another aspect, the present invention is directed to a recombinant expression vector comprising a polynucleotide encoding an engineered enzyme polypeptide and, depending on the type of host into which it is to be introduced, one or more expression regulating regions such as a promoter and a terminator, a replication origin, etc. In some embodiments, the various nucleic acid and control sequences described herein are joined together to produce a recombinant expression vector that includes one or more convenient restriction sites to allow insertion or substitution of the nucleic acid sequence encoding the enzyme polypeptide at such sites. Alternatively, in some embodiments, the nucleic acid sequence of the present invention is expressed by inserting the nucleic acid sequence, or a nucleic acid construct comprising the sequence, into an appropriate vector for expression. In some embodiments involving the creation of the expression vector, the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression.

The recombinant expression vector may be any suitable vector (e.g., a plasmid or virus) that can be conveniently subjected to recombinant DNA procedures and that brings about expression of the enzyme polynucleotide sequence. The choice of vector typically depends on the compatibility of the vector with the host cell into which the vector is to be introduced. The vector may be a linear plasmid or a closed circular plasmid.

In some embodiments, the expression vector is an autonomously replicating vector (i.e., a vector that exists as an extrachromosomal entity whose replication is independent of chromosomal replication, such as a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome). The vector may contain any means for assuring self-replication. In some alternative embodiments, the vector is one that, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. Furthermore, in some embodiments, a single vector or plasmid is utilized, or two or more vectors or plasmids that together contain the total DNA to be introduced into the genome of the host cell, and/or a transposon.

In some embodiments, the expression vector contains one or more selectable markers, which permit easy selection of transformed cells. A "selectable marker" is a gene whose product provides biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like. Examples of bacterial selectable markers include, but are not limited to, the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers that confer antibiotic resistance such as ampicillin, kanamycin, chloramphenicol, or tetracycline resistance. Suitable markers for yeast host cells include, but are not limited to, ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3. Selectable markers for use in filamentous fungal host cells include, but are not limited to, amdS (acetamidase; e.g., from A. nidulans or A. oryzae), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase; e.g., from S. hygroscopicus), hph (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5'-phosphate decarboxylase; e.g., from A. nidulans or A. oryzae), sC (sulfate adenyltransferase), and trpC (anthranilate synthase), as well as equivalents thereof.

In another aspect, the present invention provides a host cell comprising at least one polynucleotide encoding at least one engineered enzyme polypeptide of the present invention, the polynucleotide(s) being operably linked to one or more control sequences for expression of the engineered enzyme(s) in the host cell. Host cells suitable for use in expressing the polypeptides encoded by the expression vectors of the present invention are well known in the art and include, but are not limited to, bacterial cells, such as E. coli, Vibrio fluvialis, Streptomyces, and Salmonella typhimurium cells; fungal cells, such as yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris (ATCC Accession No. 201178)); insect cells, such as Drosophila S2 and Spodoptera Sf9 cells; animal cells, such as CHO, COS, BHK, 293, and Bowes melanoma cells; and plant cells. Exemplary host cells also include various Escherichia coli strains (e.g., W3110 (ΔfhuA) and BL21). Examples of bacterial selectable markers include, but are not limited to, the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers that confer antibiotic resistance such as ampicillin, kanamycin, chloramphenicol, and/or tetracycline resistance.

In some embodiments, the expression vectors of the present invention contain an element or elements that permit integration of the vector into the host cell's genome or autonomous replication of the vector in the cell independent of the genome. In some embodiments involving integration into the host cell genome, the vectors rely on the nucleic acid sequence encoding the polypeptide, or on any other element of the vector, for integration of the vector into the genome by homologous or nonhomologous recombination.

In some alternative embodiments, the expression vectors contain additional nucleic acid sequences for directing integration by homologous recombination into the genome of the host cell. The additional nucleic acid sequences enable the vector to be integrated into the host cell genome at a precise location in the chromosome. To increase the likelihood of integration at a precise location, the integrational elements preferably contain a sufficient number of nucleotides, such as 100 to 10,000 base pairs, preferably 400 to 10,000 base pairs, and most preferably 800 to 10,000 base pairs, which are highly homologous with the corresponding target sequence to enhance the probability of homologous recombination. The integrational elements may be any sequence that is homologous with the target sequence in the genome of the host cell. Furthermore, the integrational elements may be non-coding or coding nucleic acid sequences. On the other hand, the vector may be integrated into the genome of the host cell by non-homologous recombination.
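The nested length ranges recited above for integrational elements can be expressed as a simple check. This is a minimal sketch: the function name and the tier labels are hypothetical, and because the ranges are given as examples ("such as"), a length outside them is reported only as outside the recited ranges, not as unsuitable.

```python
def classify_integration_element(length_bp: int) -> str:
    """Map an integrational-element length (in base pairs) to the narrowest recited range.

    Recited ranges: 100-10,000 bp (exemplary), 400-10,000 bp (preferred),
    800-10,000 bp (most preferred). Bounds are treated as inclusive (an assumption).
    """
    if length_bp < 0:
        raise ValueError("length_bp must be non-negative")
    if 800 <= length_bp <= 10_000:
        return "most preferred"
    if 400 <= length_bp <= 10_000:
        return "preferred"
    if 100 <= length_bp <= 10_000:
        return "exemplary"
    return "outside recited ranges"


for bp in (50, 250, 500, 1500, 12_000):
    print(bp, classify_integration_element(bp))
```

The check considers length only; the requirement that the element be highly homologous with the target sequence is not modeled here.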

For autonomous replication, the vector may further comprise an origin of replication enabling the vector to replicate autonomously in the host cell in question. Examples of bacterial origins of replication are the P15A ori, or the origins of replication of plasmids pBR322, pUC19, pACYC177 (which plasmid has the P15A ori), or pACYC184, permitting replication in E. coli, and pUB110, pE194, or pTA1060, permitting replication in Bacillus. Examples of origins of replication for use in a yeast host cell are the 2 micron origin of replication, ARS1, ARS4, the combination of ARS1 and CEN3, and the combination of ARS4 and CEN6. The origin of replication may be one having a mutation that makes its functioning temperature-sensitive in the host cell (see, e.g., Ehrlich, Proc. Natl. Acad. Sci. USA 75:1433 [1978]).

In some embodiments, more than one copy of a nucleic acid sequence of the present invention is inserted into the host cell to increase production of the gene product. An increase in the copy number of the nucleic acid sequence can be obtained by integrating at least one additional copy of the sequence into the host cell genome, or by including an amplifiable selectable marker gene with the nucleic acid sequence, where cells containing amplified copies of the selectable marker gene, and thereby additional copies of the nucleic acid sequence, can be selected for by cultivating the cells in the presence of the appropriate selectable agent.

Many of the expression vectors for use in the present invention are commercially available. Suitable commercial expression vectors include, but are not limited to, the p3xFLAG™ expression vectors (Sigma-Aldrich Chemicals), which include a CMV promoter and an hGH polyadenylation site for expression in mammalian host cells, and a pBR322 origin of replication and an ampicillin resistance marker for amplification in E. coli. Other suitable expression vectors include, but are not limited to, pBluescriptII SK(-) and pBK-CMV (Stratagene), and plasmids derived from pBR322 (Gibco BRL), pUC (Gibco BRL), pREP4, pCEP4 (Invitrogen), or pPoly (see, e.g., Lathe et al., Gene 57:193-201 [1987]).

Thus, in some embodiments, a vector comprising a sequence encoding at least one variant uridine phosphorylase is transformed into a host cell in order to allow propagation of the vector and expression of the variant uridine phosphorylase(s). In some embodiments, the variant uridine phosphorylases are post-translationally modified to remove the signal peptide and, in some cases, may be cleaved after secretion. In some embodiments, the transformed host cell described above is cultured in a suitable nutrient medium under conditions permitting the expression of the variant uridine phosphorylase(s). Any suitable medium useful for culturing the host cells finds use in the present invention, including, but not limited to, minimal or complex media containing appropriate supplements. In some embodiments, host cells are grown in HTP media. Suitable media are available from various commercial suppliers or may be prepared according to published recipes (e.g., in catalogues of the American Type Culture Collection).

In another aspect, the present invention provides host cells comprising a polynucleotide encoding an improved uridine phosphorylase polypeptide provided herein, the polynucleotide being operably linked to one or more control sequences for expression of the uridine phosphorylase in the host cells. Host cells for use in expressing the uridine phosphorylase polypeptides encoded by the expression vectors of the present invention are well known in the art and include, but are not limited to, bacterial cells, such as E. coli, Bacillus megaterium, Lactobacillus kefir, Streptomyces, and Salmonella typhimurium cells; fungal cells, such as yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris (ATCC Accession No. 201178)); insect cells, such as Drosophila S2 and Spodoptera Sf9 cells; animal cells, such as CHO, COS, BHK, 293, and Bowes melanoma cells; and plant cells. Appropriate culture media and growth conditions for the above-described host cells are well known in the art.

Polynucleotides for expression of the uridine phosphorylase may be introduced into cells by various methods known in the art. Techniques include, among others, electroporation, biolistic particle bombardment, liposome-mediated transfection, calcium chloride transfection, and protoplast fusion. Various methods for introducing polynucleotides into cells are known to those skilled in the art.

In some embodiments, the host cell is a eukaryotic cell. Suitable eukaryotic host cells include, but are not limited to, fungal cells, algal cells, insect cells, and plant cells. Suitable fungal host cells include, but are not limited to, Ascomycota, Basidiomycota, Deuteromycota, Zygomycota, and Fungi imperfecti. In some embodiments, the fungal host cells are yeast cells and filamentous fungal cells. The filamentous fungal host cells of the present invention include all filamentous forms of the subdivision Eumycotina and of Oomycota. Filamentous fungi are characterized by a vegetative mycelium with a cell wall composed of chitin, cellulose, and other complex polysaccharides. The filamentous fungal host cells of the present invention are morphologically distinct from yeast.

In some embodiments of the present invention, the filamentous fungal host cells are of any suitable genus and species, including, but not limited to, Achlya, Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Cephalosporium, Chrysosporium, Cochliobolus, Corynascus, Cryphonectria, Cryptococcus, Coprinus, Coriolus, Diplodia, Endothia, Fusarium, Gibberella, Gliocladium, Humicola, Hypocrea, Myceliophthora, Mucor, Neurospora, Penicillium, Podospora, Phlebia, Piromyces, Pyricularia, Rhizomucor, Rhizopus, Schizophyllum, Scytalidium, Sporotrichum, Talaromyces, Thermoascus, Thielavia, Trametes, Tolypocladium, Trichoderma, Verticillium, and/or Volvariella, and/or teleomorphs or anamorphs thereof, and synonyms, basionyms, or taxonomic equivalents thereof.

In some embodiments of the present invention, the host cell is a yeast cell, including, but not limited to, cells of Candida, Hansenula, Saccharomyces, Schizosaccharomyces, Pichia, Kluyveromyces, or Yarrowia species. In some embodiments of the present invention, the yeast cell is Hansenula polymorpha, Saccharomyces cerevisiae, Saccharomyces carlsbergensis, Saccharomyces diastaticus, Saccharomyces norbensis, Saccharomyces kluyveri, Schizosaccharomyces pombe, Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia angusta, Kluyveromyces lactis, Candida albicans, or Yarrowia lipolytica.

In some embodiments of the present invention, the host cell is an algal cell, such as Chlamydomonas (e.g., C. reinhardtii) or Phormidium (P. sp. ATCC 29409).

In some other embodiments, the host cell is a prokaryotic cell. Suitable prokaryotic cells include, but are not limited to, Gram-positive, Gram-negative, and Gram-variable bacterial cells. Any suitable bacterial organism finds use in the present invention, including, but not limited to, Agrobacterium, Alicyclobacillus, Anabaena, Anacystis, Acinetobacter, Acidothermus, Arthrobacter, Azotobacter, Bacillus, Bifidobacterium, Brevibacterium, Butyrivibrio, Buchnera, Campestris, Campylobacter, Clostridium, Corynebacterium, Chromatium, Coprococcus, Escherichia, Enterococcus, Enterobacter, Erwinia, Fusobacterium, Faecalibacterium, Francisella, Flavobacterium, Geobacillus, Haemophilus, Helicobacter, Klebsiella, Lactobacillus, Lactococcus, Ilyobacter, Micrococcus, Microbacterium, Mesorhizobium, Methylobacterium, Mycobacterium, Neisseria, Pantoea, Pseudomonas, Prochlorococcus, Rhodobacter, Rhodopseudomonas, Roseburia, Rhodospirillum, Rhodococcus, Scenedesmus, Streptomyces, Streptococcus, Synechococcus, Saccharomonospora, Staphylococcus, Serratia, Salmonella, Shigella, Thermoanaerobacterium, Tropheryma, Tularensis, Temecula, Thermosynechococcus, Thermococcus, Ureaplasma, Xanthomonas, Xylella, Yersinia, and Zymomonas. In some embodiments, the host cell is a species of Agrobacterium, Acinetobacter, Azotobacter, Bacillus, Bifidobacterium, Buchnera, Geobacillus, Campylobacter, Clostridium, Corynebacterium, Escherichia, Enterococcus, Erwinia, Flavobacterium, Lactobacillus, Lactococcus, Pantoea, Pseudomonas, Staphylococcus, Salmonella, Streptococcus, Streptomyces, or Zymomonas. In some embodiments, the bacterial host strain is non-pathogenic to humans. In some embodiments, the bacterial host strain is an industrial strain. Numerous bacterial industrial strains are known and suitable for use in the present invention. In some embodiments of the present invention, the bacterial host cell is an Agrobacterium species (e.g., A. radiobacter, A. rhizogenes, and A. rubi). In some embodiments of the present invention, the bacterial host cell is an Arthrobacter species (e.g., A. aurescens, A. citreus, A. globiformis, A. hydrocarboglutamicus, A. mysorens, A. nicotianae, A. paraffineus, A. protophormiae, A. roseoparaffinus, A. sulfureus, and A. ureafaciens). In some embodiments of the present invention, the bacterial host cell is a Bacillus species (e.g., B. thuringiensis, B. anthracis, B. megaterium, B. subtilis, B. lentus, B. circulans, B. pumilus, B. lautus, B. coagulans, B. brevis, B. firmus, B. alcalophilus, B. licheniformis, B. clausii, B. stearothermophilus, B. halodurans, and B. amyloliquefaciens). In some embodiments, the host cell is an industrial Bacillus strain, including, but not limited to, B. subtilis, B. pumilus, B. licheniformis, B. megaterium, B. clausii, B. stearothermophilus, or B. amyloliquefaciens. In some embodiments, the Bacillus host cell is B. subtilis, B. licheniformis, B. megaterium, B. stearothermophilus, and/or B. amyloliquefaciens. In some embodiments, the bacterial host cell is a Clostridium species (e.g., C. acetobutylicum, C. tetani E88, C. lituseburense, C. saccharobutylicum, C. perfringens, and C. beijerinckii). In some embodiments, the bacterial host cell is a Corynebacterium species (e.g., C. glutamicum and C. acetoacidophilum). In some embodiments, the bacterial host cell is an Escherichia species (e.g., E. coli). In some embodiments, the host cell is Escherichia coli W3110. In some embodiments, the bacterial host cell is an Erwinia species (e.g., E. uredovora, E. carotovora, E. ananas, E. herbicola, E. punctata, and E. terreus). In some embodiments, the bacterial host cell is a Pantoea species (e.g., P. citrea and P. agglomerans). In some embodiments, the bacterial host cell is a Pseudomonas species (e.g., P. putida, P. aeruginosa, P. mevalonii, and P. sp. D-0110). In some embodiments, the bacterial host cell is a Streptococcus species (e.g., S. equisimilis, S. pyogenes, and S. uberis). In some embodiments, the bacterial host cell is a Streptomyces species (e.g., S. ambofaciens, S. achromogenes, S. avermitilis, S. coelicolor, S. aureofaciens, S. aureus, S. fungicidicus, S. griseus, and S. lividans). In some embodiments, the bacterial host cell is a Zymomonas species (e.g., Z. mobilis and Z. lipolytica).

Many prokaryotic and eukaryotic strains that find use in the present invention are readily available to the public from a number of culture collections, such as the American Type Culture Collection (ATCC), the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), the Centraalbureau Voor Schimmelcultures (CBS), and the Agricultural Research Service Patent Culture Collection, Northern Regional Research Center (NRRL).

In some embodiments, host cells are genetically modified to have characteristics that improve protein secretion, protein stability, and/or other properties desirable for expression and/or secretion of a protein. Genetic modification can be achieved by genetic engineering techniques and/or classical microbiological techniques (e.g., chemical or UV mutagenesis and subsequent selection). Indeed, in some embodiments, combinations of recombinant modification and classical selection techniques are used to produce the host cells. Using recombinant technology, nucleic acid molecules can be introduced, deleted, inhibited, or modified in a manner that results in increased yields of uridine phosphorylase variant(s) within the host cell and/or in the culture medium. For example, knockout of Alp1 function results in a cell that is protease deficient, and knockout of pyr5 function results in a cell with a pyrimidine-deficient phenotype. In one genetic engineering approach, homologous recombination is used to induce targeted gene modifications by specifically targeting a gene in vivo to suppress expression of the encoded protein. In alternative approaches, siRNA, antisense, and/or ribozyme technology find use in inhibiting gene expression. A variety of methods are known in the art for reducing expression of a protein in cells, including, but not limited to, deletion of all or part of the gene encoding the protein and site-specific mutagenesis to disrupt expression or activity of the gene product (see, e.g., Chaveroche et al., Nucl. Acids Res., 28:22e97 [2000]; Cho et al., Molec. Plant Microbe Interact., 19:7-15 [2006]; Maruyama and Kitamoto, Biotechnol. Lett., 30:1811-1817 [2008]; Takahashi et al., Mol. Gen. Genom., 272:344-352 [2004]; and You et al., Arch. Microbiol., 191:615-622 [2009], all of which are incorporated by reference herein). Random mutagenesis followed by screening for desired mutations also finds use (see, e.g., Combier et al., FEMS Microbiol. Lett., 220:141-8 [2003]; and Firon et al., Eukary. Cell 2:247-55 [2003], both of which are incorporated by reference).

Introduction of a vector or DNA construct into a host cell can be accomplished using any suitable method known in the art, including, but not limited to, calcium phosphate transfection, DEAE-dextran mediated transfection, PEG-mediated transformation, electroporation, or other common techniques known in the art. In some embodiments, the Escherichia coli expression vector pCK100900i (see U.S. Pat. No. 9,714,437, which is hereby incorporated by reference) finds use.

In some embodiments, the engineered host cells (i.e., "recombinant host cells") of the present invention are cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants, or amplifying the uridine phosphorylase polynucleotide. Culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and are well known to those skilled in the art. As noted, many standard references and texts are available for the culture and production of many cells, including cells of bacterial, plant, animal (especially mammalian) and archaebacterial origin.

In some embodiments, cells expressing the variant uridine phosphorylase polypeptides of the invention are grown under batch or continuous fermentation conditions. Classical "batch fermentation" is a closed system, wherein the composition of the medium is set at the beginning of the fermentation and is not subject to artificial alteration during the fermentation. A variation of the batch system is "fed-batch fermentation", which also finds use in the present invention. In this variation, the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression is likely to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the medium. Batch and fed-batch fermentations are common and well known in the art. "Continuous fermentation" is an open system in which a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density in which the cells are primarily in log-phase growth. Continuous fermentation systems strive to maintain steady-state growth conditions. Methods for modulating nutrients and growth factors for continuous fermentation processes, as well as techniques for maximizing the rate of product formation, are well known in the art of industrial microbiology.

In some embodiments of the present invention, cell-free transcription/translation systems find use in producing one or more variant uridine phosphorylases. Several such systems are commercially available, and the methods are well known to those skilled in the art.

The present invention provides methods of making variant uridine phosphorylase polypeptides or biologically active fragments thereof. In some embodiments, the method comprises: providing a host cell transformed with a polynucleotide encoding an amino acid sequence that has at least about 70% (or at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%) sequence identity to SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 246, SEQ ID NO: 594, SEQ ID NO: 776 and/or SEQ ID NO: 868, and that comprises at least one mutation as provided herein; culturing the transformed host cell in a culture medium under conditions in which the host cell expresses the encoded variant uridine phosphorylase polypeptide; and optionally recovering or isolating the expressed variant uridine nucleoside phosphorylase polypeptide, and/or recovering or isolating the culture medium containing the expressed variant uridine nucleoside phosphorylase polypeptide. In some embodiments, the methods further provide optionally lysing the transformed host cells after expressing the encoded uridine phosphorylase polypeptide, and optionally recovering and/or isolating the expressed variant uridine phosphorylase polypeptide from the cell lysate. The present invention further provides methods of making a variant uridine phosphorylase polypeptide, comprising culturing a host cell transformed with a variant uridine phosphorylase polypeptide under conditions suitable for the production of the variant uridine phosphorylase polypeptide, and recovering the variant uridine phosphorylase polypeptide. Typically, the uridine phosphorylase polypeptide is recovered or isolated from the host cell culture medium, the host cell, or both, using protein recovery techniques that are well known in the art, including those described herein. In some embodiments, host cells are harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract is retained for further purification. Microbial cells employed in the expression of proteins can be disrupted by any convenient method, including, but not limited to, freeze-thaw cycling, sonication, mechanical disruption, and/or use of cell lysing agents, as well as many other suitable methods well known to those skilled in the art.
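The sequence identity thresholds recited above can be illustrated with a short calculation. The sketch below is not the alignment procedure used for the claimed sequences; it assumes the two sequences have already been aligned (gaps written as `-`) and counts identical positions over the full alignment length, which is only one of several conventions. The function names and the peptide fragments are hypothetical.

```python
# Minimal sketch: percent identity of two PRE-ALIGNED sequences.
# Assumption: gap columns ("-") count as non-identical positions and the
# denominator is the full alignment length (the comparison window).

def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Return percent identity over the aligned length."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_ref, aligned_query)
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_ref)


def meets_threshold(aligned_ref: str, aligned_query: str, threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_ref, aligned_query) >= threshold


# Hypothetical 10-residue fragments differing at one position.
ref = "MSKSDVFHLG"
variant = "MSKADVFHLG"
print(percent_identity(ref, variant))
print(meets_threshold(ref, variant, threshold=95.0))
```

In practice an alignment tool would produce the aligned strings, and the choice of comparison window and gap handling changes the reported value.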

Engineered uridine phosphorylase enzymes expressed in a host cell can be recovered from the cells and/or the culture medium using any one or more of the techniques known in the art for protein purification, including, among others, lysozyme treatment, sonication, filtration, salting-out, ultracentrifugation, and chromatography. Solutions suitable for lysis and high-efficiency extraction of proteins from bacteria, such as E. coli, are commercially available under the trade name CelLytic B™ (Sigma-Aldrich). Thus, in some embodiments, the resulting polypeptide is recovered/isolated and optionally purified by any of a number of methods known in the art. For example, in some embodiments, the polypeptide is isolated from the nutrient medium by conventional procedures including, but not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, chromatography (e.g., ion exchange, affinity, hydrophobic interaction, chromatofocusing, and size exclusion), or precipitation. In some embodiments, protein refolding steps are used, as desired, in completing the configuration of the mature protein. In addition, in some embodiments, high performance liquid chromatography (HPLC) is employed in the final purification steps. For example, in some embodiments, methods known in the art find use in the present invention (see, e.g., Parry et al., Biochem. J., 353:117 [2001]; and Hong et al., Appl. Microbiol. Biotechnol., 73:1331 [2007], both of which are incorporated herein by reference). Indeed, any suitable purification method known in the art finds use in the present invention.

Chromatographic techniques for isolation of the uridine phosphorylase polypeptide include, but are not limited to, reverse phase chromatography, high performance liquid chromatography, ion exchange chromatography, gel electrophoresis, and affinity chromatography. Conditions for purifying a particular enzyme will depend, in part, on factors such as net charge, hydrophobicity, hydrophilicity, molecular weight, molecular shape, etc., which are known to those skilled in the art.

In some embodiments, affinity techniques find use in isolating the improved uridine phosphorylase enzymes. For affinity chromatography purification, any antibody that specifically binds the uridine phosphorylase polypeptide may be used. For the production of antibodies, various host animals, including but not limited to rabbits, mice, rats, etc., may be immunized by injection with the uridine phosphorylase. The uridine phosphorylase polypeptide may be attached to a suitable carrier, such as BSA, by means of a side chain functional group or linkers attached to a side chain functional group. Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants such as BCG (Bacillus Calmette-Guérin) and Corynebacterium parvum.

In some embodiments, the uridine phosphorylase variants are prepared and used in the form of cells expressing the enzymes, as crude extracts, or as isolated or purified preparations. In some embodiments, the uridine phosphorylase variants are prepared as lyophilisates, in powder form (e.g., acetone powders), or as enzyme solutions. In some embodiments, the uridine phosphorylase variants are in the form of substantially pure preparations.

In some embodiments, the uridine phosphorylase polypeptides are attached to any suitable solid substrate. Solid substrates include but are not limited to a solid phase, surface, and/or membrane. Solid supports include, but are not limited to, organic polymers such as polystyrene, polyethylene, polypropylene, polyfluoroethylene, polyethyleneoxy, and polyacrylamide, as well as co-polymers and grafts thereof. A solid support can also be inorganic, such as glass, silica, controlled pore glass (CPG), reverse phase silica, or a metal such as gold or platinum. The configuration of the substrate can be in the form of beads, spheres, particles, granules, a gel, a membrane or a surface. Surfaces can be planar, substantially planar, or non-planar. Solid supports can be porous or non-porous, and can have swelling or non-swelling characteristics. A solid support can be configured in the form of a well, depression, or other container, vessel, feature, or location. A plurality of supports can be configured on an array at various locations that are addressable for robotic delivery of reagents or by detection methods and/or instruments.

In some embodiments, immunological methods are used to purify uridine phosphorylase variants. In one approach, antibody raised against a wild-type or variant uridine phosphorylase polypeptide (e.g., against a polypeptide comprising any of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 246, SEQ ID NO: 594, SEQ ID NO: 776 and/or SEQ ID NO: 868, and/or a variant thereof, and/or an immunogenic fragment thereof) using conventional methods is immobilized on beads, mixed with cell culture media under conditions in which the variant uridine phosphorylase is bound, and precipitated. In a related approach, immunochromatography finds use.

In some embodiments, the variant uridine phosphorylases are expressed as a fusion protein including a non-enzyme portion. In some embodiments, the variant uridine phosphorylase sequence is fused to a purification facilitating domain. As used herein, the term "purification facilitating domain" refers to a domain that mediates purification of the polypeptide to which it is fused. Suitable purification domains include, but are not limited to, metal chelating peptides, histidine-tryptophan modules that allow purification on immobilized metals, a sequence which binds glutathione (e.g., GST), a hemagglutinin (HA) tag (corresponding to an epitope derived from the influenza hemagglutinin protein; see, e.g., Wilson et al., Cell 37:767 [1984]), maltose binding protein sequences, the FLAG epitope utilized in the FLAGS extension/affinity purification system (e.g., the system available from Immunex Corp), and the like. One expression vector contemplated for use in the compositions and methods described herein provides for expression of a fusion protein comprising a polypeptide of the invention fused to a polyhistidine region separated by an enterokinase cleavage site. The histidine residues facilitate purification on IMIAC (immobilized metal ion affinity chromatography; see, e.g., Porath et al., Prot. Exp. Purif., 3:263-281 [1992]), while the enterokinase cleavage site provides a means for separating the variant uridine phosphorylase polypeptide from the fusion protein. pGEX vectors (Promega) may also be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to ligand-agarose beads (e.g., glutathione-agarose in the case of GST fusions), followed by elution in the presence of free ligand.

Accordingly, in another aspect, the present invention provides methods of producing the engineered enzyme polypeptides, where the methods comprise culturing a host cell capable of expressing a polynucleotide encoding the engineered enzyme polypeptide under conditions suitable for expression of the polypeptide. In some embodiments, the methods further comprise the steps of isolating and/or purifying the enzyme polypeptides, as described herein.

Appropriate culture media and growth conditions for host cells are well known in the art. It is contemplated that any suitable method for introducing polynucleotides for expression of the enzyme polypeptides into cells will find use in the present invention. Suitable techniques include, but are not limited to, electroporation, biolistic particle bombardment, liposome-mediated transfection, calcium chloride transfection, and protoplast fusion.

Various features and embodiments of the present invention are illustrated in the following representative examples, which are intended to be illustrative, and not limiting.

EXPERIMENTAL

The following Examples, including experiments and results achieved, are provided for illustrative purposes only and are not to be construed as limiting the present invention. Indeed, there are various suitable sources for many of the reagents and equipment described below. It is not intended that the present invention be limited to any particular source for any reagent or equipment item.

In the experimental disclosure below, the following abbreviations apply: M (moles/liter); mM (millimoles/liter); uM and μM (micromoles/liter); nM (nanomoles/liter); mol (moles); gm and g (gram); mg (milligrams); ug and μg (micrograms); L and l (liter); ml and mL (milliliter); cm (centimeters); mm (millimeters); um and μm (micrometers); sec. (seconds); min (minutes); h and hr (hours); U (units); MW (molecular weight); rpm (rotations per minute); psi and PSI (pounds per square inch); °C (degrees Celsius); RT and rt (room temperature); CV (coefficient of variation); CAM and cam (chloramphenicol); PMBS (polymyxin B sulfate); IPTG (isopropyl β-D-1-thiogalactopyranoside); LB (lysogeny broth); TB (terrific broth); SFP (shake flask powder); CDS (coding sequence); DNA (deoxyribonucleic acid); RNA (ribonucleic acid); nt (nucleotide; polynucleotide); aa (amino acid; polypeptide); E. coli W3110 (commonly used laboratory E. coli strain, available from the Coli Genetic Stock Center [CGSC], New Haven, CT); HTP (high throughput); HPLC (high pressure liquid chromatography); HPLC-UV (HPLC with ultraviolet-visible detector); 1H NMR (proton nuclear magnetic resonance spectroscopy); FIOPC (fold improvement over positive control); Sigma and Sigma-Aldrich (Sigma-Aldrich, St. Louis, MO); Difco (Difco Laboratories, BD Diagnostic Systems, Detroit, MI); Microfluidics (Microfluidics, Westwood, MA); Life Technologies (Life Technologies, a part of Fisher Scientific, Waltham, MA); Amresco (Amresco, LLC, Solon, OH); Carbosynth (Carbosynth, Ltd., Berkshire, UK); Varian (Varian Medical Systems, Palo Alto, CA); Agilent (Agilent Technologies, Inc., Santa Clara, CA); Infors (Infors USA Inc., Annapolis Junction, MD); and Thermotron (Thermotron, Inc., Holland, MI).

Example 1

E. coli Expression Hosts Containing Recombinant Uridine Phosphorylase Genes

The initial uridine phosphorylase (UP) enzyme used to produce the variants of the present invention was obtained from the E. coli genome and cloned into the expression vector pCK110900 (see FIG. 3 of U.S. Patent Application Publication No. 2006/0195947), operatively linked to the lac promoter under control of the lacI repressor. The expression vector also contains the P15a origin of replication and a chloramphenicol resistance gene. The resulting plasmids were transformed into E. coli W3110 using standard methods known in the art. The transformants were isolated by subjecting the cells to chloramphenicol selection, as known in the art (see, e.g., U.S. Pat. No. 8,383,346 and WO2010/144103).

Example 2

HTP Preparation of UP-Containing Wet Cell Pellets and Lysates

E. coli cells containing recombinant UP-encoding genes from monoclonal colonies were inoculated into 180 μL LB containing 1% glucose and 30 μg/mL chloramphenicol in the wells of 96-well shallow-well microtiter plates. The plates were sealed with O2-permeable seals, and the cultures were grown overnight at 30°C, 200 rpm, and 85% humidity. Then, 10 μL of each of the cell cultures was transferred into the wells of 96-well deep-well plates containing 390 μL TB and 30 μg/mL CAM. The deep-well plates were sealed with O2-permeable seals and incubated at 30°C, 250 rpm, and 85% humidity until an OD600 of 0.6-0.8 was reached. The cell cultures were then induced with IPTG to a final concentration of 1 mM and incubated overnight at 30°C. The cells were then pelleted by centrifugation at 4,000 rpm for 10 min. The supernatants were discarded, and the pellets were frozen at -80°C prior to lysis.

For lysis, 400 μL of lysis buffer containing 100 mM triethanolamine (TEoA) buffer, pH 7.5, 1 g/L lysozyme, and 0.5 g/L polymyxin B sulfate (PMBS) was added to the cell paste in each well. The cells were lysed at room temperature for 2 hours with shaking on a bench top shaker. The plate was then centrifuged for 15 min at 4,000 rpm and 4°C. The clear supernatants were then used in biocatalytic reactions to determine their activity levels.
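The per-well lysis volume and additive concentrations given above scale directly to a plate-level recipe. The following sketch is illustrative only: the 400 μL per well, 1 g/L lysozyme, and 0.5 g/L PMBS values come from the text, while the 10% overage and the function name are assumptions introduced here.

```python
# Sketch: scale the Example 2 lysis buffer to a full 96-well plate.
# Assumption: a 10% overage to cover pipetting losses (not from the text).

def lysis_buffer_recipe(n_wells: int = 96,
                        vol_per_well_ul: float = 400.0,
                        overage: float = 0.10) -> dict:
    """Return total buffer volume (mL) and additive masses (mg) for one plate."""
    total_ml = n_wells * vol_per_well_ul * (1.0 + overage) / 1000.0
    return {
        "total_volume_ml": total_ml,
        # g/L is numerically equal to mg/mL, so mass (mg) = conc (g/L) * volume (mL)
        "lysozyme_mg": 1.0 * total_ml,  # 1 g/L lysozyme
        "pmbs_mg": 0.5 * total_ml,      # 0.5 g/L polymyxin B sulfate
    }


recipe = lysis_buffer_recipe()
for name, value in recipe.items():
    print(f"{name}: {value:.2f}")
```

The base buffer (100 mM TEoA, pH 7.5) is assumed to be prepared separately and used to bring the additives to the total volume.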

Example 3

Preparation of Lyophilized Lysates from Shake Flask (SF) Cultures

Selected HTP cultures grown as described above were plated onto LB agar plates containing 1% glucose and 30 μg/mL CAM and grown overnight at 37°C. A single colony from each culture was transferred to 6 mL of LB containing 1% glucose and 30 μg/mL CAM. The cultures were grown for 18 hours at 30°C and 250 rpm, and subcultured at approximately 1:50 into 250 mL of TB containing 30 μg/mL CAM, to a final OD600 of 0.05. The cultures were grown for approximately 195 minutes at 30°C and 250 rpm, to an OD600 between 0.6 and 0.8, and induced with 1 mM IPTG. The cultures were then grown for 20 hours at 30°C and 250 rpm. The cultures were centrifuged at 4,000 rpm for 20 min. The supernatant was discarded, and the pellets were resuspended in 30 mL of 20 mM triethanolamine, pH 7.5, and lysed at 18,000 psi using a processor system (Microfluidics). The lysates were pelleted (10,000 rpm for 60 min), and the supernatants were frozen and lyophilized to generate shake flask (SF) enzyme powders.
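The subculture step above targets a starting OD600 of 0.05 in 250 mL, which corresponds to the stated dilution of about 1:50 when the overnight culture is near OD600 2.5. A minimal sketch of that arithmetic follows; the overnight OD600 of 2.5 is an assumed value chosen for illustration, and the function name is hypothetical.

```python
# Sketch: inoculum volume needed to reach a target starting OD600.
# Uses C1 * V1 = C2 * V2, treating OD600 as proportional to cell density.

def inoculum_volume_ml(target_od: float,
                       final_volume_ml: float,
                       starter_od: float) -> float:
    """Volume of starter culture (mL) giving target_od in final_volume_ml."""
    if starter_od <= target_od:
        raise ValueError("starter culture must be denser than the target")
    return target_od * final_volume_ml / starter_od


# Assumed overnight OD600 of 2.5 (illustrative, not from the text).
volume = inoculum_volume_ml(target_od=0.05, final_volume_ml=250.0, starter_od=2.5)
print(f"inoculum: {volume:.1f} mL, dilution 1:{250.0 / volume:.0f}")
```

Under that assumption the inoculum fits within the 6 mL overnight culture described in the text.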

Example 4

Improvements Over SEQ ID NO: 2 in the Phosphorolysis of 5'-Isobutyryluridine

SEQ ID NO: 2 was selected as the parent enzyme based on the results of screening variants for the phosphorolysis of 5'-isobutyryluridine (Scheme II, below).

The reaction of Scheme II is reversible, and in the initial stages of the research the reverse reaction was successfully used as a proxy for the desired forward reaction depicted in Scheme I, above.

Libraries of engineered genes were produced using well-established techniques (e.g., saturation mutagenesis and recombination of previously identified beneficial mutations). The polypeptides encoded by each gene were produced in HTP as described in Example 2. For all variants, the cell pellets were lysed by adding 400 μL of lysis buffer (containing 100 mM triethanolamine buffer, pH 7.5, 1 g/L lysozyme, and 0.5 g/L PMBS) and shaking at room temperature for 2 hours on a bench top shaker. The plates were centrifuged at 4,000 rpm for 15 minutes at 4°C to remove cell debris.

Reactions were performed in a 96-well format in 2 mL deep-well plates, with 100 μL total volume. Reactions included 11.1 g/L (3.5 mM) 5'-isobutyryluridine, 10 mM potassium chloride, 50 mM sodium phosphate, pH 7.4, and 10× diluted UP lysate. The reactions were set up as follows: (i) all the reaction components, except for UP, were premixed in a single solution, and 87.5 μL of this solution was then aliquoted into each well of the 96-well plates; (ii) 12.5 μL of 10× diluted UP lysate was then added to the wells to initiate the reaction. The reaction plate was heat-sealed with a foil seal and incubated at 35°C for 18-20 hours with shaking at 600 rpm. The reactions were quenched with 100 μL of a 1:1 mixture of acetonitrile in 20 mM TEoA pH 7.5 buffer. The quenched reactions were shaken for 3 min on a bench top shaker, followed by centrifugation at 4,000 rpm for 10 min at 4°C to pellet any precipitate. Supernatant (50 μL) was then transferred into a 96-well round-bottom plate prefilled with 100 μL of a 1:1 mixture of acetonitrile in 20 mM TEoA pH 7.5 buffer. The samples were analyzed according to the HILIC analytical method summarized in Table 1.1.
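The volumes in this protocol imply fixed dilution factors for both the enzyme and the analytical sample, which can be worked out as follows. This is a bookkeeping sketch based only on the volumes stated above; the helper name `dilution_factor` and the variable names are introduced here for illustration.

```python
# Sketch: dilution bookkeeping for the Example 4 plate assay.

def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when sample_ul is combined with diluent_ul."""
    return (sample_ul + diluent_ul) / sample_ul


premix_ul = 87.5   # all components except UP
lysate_ul = 12.5   # 10x pre-diluted UP lysate
reaction_ul = premix_ul + lysate_ul

# Overall dilution of the clarified lysate in the reaction well:
# 10x pre-dilution times the dilution on addition to the premix.
lysate_dilution_in_reaction = 10.0 * dilution_factor(lysate_ul, premix_ul)

# Dilution of the reaction mixture before analysis:
# quench (100 uL reaction + 100 uL quench), then 50 uL into 100 uL diluent.
sample_dilution = dilution_factor(reaction_ul, 100.0) * dilution_factor(50.0, 100.0)

print(f"reaction volume: {reaction_ul} uL")
print(f"lysate dilution in reaction: {lysate_dilution_in_reaction}x")
print(f"sample dilution before analysis: {sample_dilution}x")
```

Because every well receives the same dilutions, these factors cancel when variant and parent are compared on the same plate.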

Activity relative to SEQ ID NO: 2 (activity FIOP) was calculated as the percent conversion to uracil for the variant (uracil peak area/[uracil peak area + 5'-isobutyryluridine peak area]) divided by the percent conversion to uracil in the reaction with SEQ ID NO: 2. The results are shown in Table 1.2.
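The activity calculation described above can be sketched in a few lines. The peak areas below are invented values for illustration, and the function names are hypothetical; the formula itself follows the definition in the text, which uses raw peak areas without response-factor correction.

```python
# Sketch: percent conversion and fold improvement over parent (FIOP)
# from uracil and 5'-isobutyryluridine peak areas.

def percent_conversion(uracil_area: float, substrate_area: float) -> float:
    """Percent conversion = uracil / (uracil + substrate) * 100."""
    total = uracil_area + substrate_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * uracil_area / total


def activity_fiop(variant_conversion: float, parent_conversion: float) -> float:
    """Ratio of variant percent conversion to parent percent conversion."""
    if parent_conversion <= 0:
        raise ValueError("parent conversion must be positive")
    return variant_conversion / parent_conversion


# Illustrative peak areas (arbitrary units), not measured data.
variant = percent_conversion(uracil_area=600.0, substrate_area=400.0)
parent = percent_conversion(uracil_area=300.0, substrate_area=700.0)
print(f"variant: {variant:.1f}%  parent: {parent:.1f}%  FIOP: {activity_fiop(variant, parent):.2f}")
```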

Example 5

Improvements Over SEQ ID NO: 4 in the Synthesis of 5'-Isobutyryluridine

SEQ ID NO: 4 was selected as the parent enzyme based on the results of screening variants for the synthesis of 5'-isobutyryluridine from 5'-isobutyrylribose-1-phosphate and uracil (Scheme I, above).

Libraries of engineered genes were produced using well-established techniques (e.g., saturation mutagenesis and recombination of previously identified beneficial mutations). The polypeptides encoded by each gene were produced in HTP as described in Example 2. For all variants, the cell pellets were lysed by adding 400 μL of lysis buffer (containing 100 mM triethanolamine buffer, pH 7.5, 1 g/L lysozyme, and 0.5 g/L PMBS) and shaking at room temperature for 2 hours on a bench top shaker. The plates were centrifuged at 4,000 rpm for 15 minutes at 4°C to remove cell debris.

Reactions were performed in a 96-well format in 2 mL deep-well plates, with 100 μL total volume. Reactions included 67 mM 5'-isobutyrylribose-1-phosphate solution, 7.5 g/L (67 mM) uracil, 91 g/L (267 mM, 4 equivalents) sucrose, 0.019 g/L (0.25 wt% with respect to uracil) SUP-101 (SEQ ID NO: 852), 50 mM triethanolamine, pH 7.5, and 100× diluted UP lysate. The reactions were set up as follows: (i) all the reaction components, except for UP, were premixed in a single solution, and 90 μL of this solution was then aliquoted into each well of the 96-well plates; (ii) 10 μL of 100× diluted UP lysate was then added to the wells to initiate the reaction. The reaction plate was heat-sealed with a foil seal and incubated at 30°C for 18-20 hours with shaking at 600 rpm. The reactions were quenched with 400 μL of a 1:1 mixture of acetonitrile in 20 mM TEoA pH 7.5 buffer. The quenched reactions were shaken for 5 min on a bench top shaker, followed by centrifugation at 4,000 rpm for 10 min at 4°C to pellet any precipitate. Supernatant (30 μL) was then transferred into a 96-well round-bottom plate prefilled with 120 μL of a 1:9 mixture of acetonitrile in 20 mM TEoA pH 7.5 buffer. The samples were analyzed according to the reverse phase analytical method summarized in Table 2.1.
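The mass and molar concentrations quoted for uracil and sucrose, and the SUP-101 loading, can be cross-checked with a short calculation. The sketch below assumes standard molar masses for uracil (112.09 g/mol) and sucrose (342.30 g/mol), which are not stated in the text; the function names are hypothetical.

```python
# Sketch: cross-check the reagent concentrations given for the synthesis reaction.
# Assumed molar masses (g/mol): uracil 112.09, sucrose 342.30.

URACIL_MW = 112.09
SUCROSE_MW = 342.30


def g_per_l_to_mm(conc_g_per_l: float, molar_mass: float) -> float:
    """Convert a mass concentration (g/L) to millimolar."""
    return conc_g_per_l / molar_mass * 1000.0


def loading_g_per_l(reference_g_per_l: float, wt_percent: float) -> float:
    """Mass concentration corresponding to wt% of a reference component."""
    return reference_g_per_l * wt_percent / 100.0


uracil_mm = g_per_l_to_mm(7.5, URACIL_MW)
sucrose_mm = g_per_l_to_mm(91.0, SUCROSE_MW)

print(f"uracil: {uracil_mm:.1f} mM")
print(f"sucrose: {sucrose_mm:.1f} mM ({sucrose_mm / uracil_mm:.2f} equivalents)")
print(f"SUP-101 at 0.25 wt% of uracil: {loading_g_per_l(7.5, 0.25):.4f} g/L")
```

The computed values agree with the rounded figures in the text to within about one percent.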

Activity relative to SEQ ID NO: 4 (activity FIOP) was calculated as the peak area of the 5'-isobutyryluridine product formed in the reaction with the variant divided by the peak area of the 5'-isobutyryluridine product formed in the reaction with SEQ ID NO: 4. The results are shown in Table 2.2.
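For the synthesis direction, the activity metric is a direct ratio of product peak areas. A minimal sketch follows; averaging several parent control wells is an assumption made here for illustration (the text does not say how parent wells were combined), and the peak areas are invented.

```python
# Sketch: FIOP as a ratio of 5'-isobutyryluridine product peak areas.
# Assumption: the parent reference is the mean of the parent control wells.
from statistics import mean


def product_fiop(variant_area: float, parent_control_areas: list[float]) -> float:
    """Variant product peak area divided by the mean parent product peak area."""
    parent_mean = mean(parent_control_areas)
    if parent_mean <= 0:
        raise ValueError("parent control peak area must be positive")
    return variant_area / parent_mean


# Illustrative peak areas (arbitrary units), not measured data.
print(product_fiop(1500.0, [980.0, 1020.0]))
```

A ratio above 1 indicates more product than the parent under identical conditions; values from different plates are comparable only if each is normalized to controls on its own plate.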

Example 6

Improvements Over SEQ ID NO: 246 in the Synthesis of 5'-Isobutyryluridine

SEQ ID NO: 246 was selected as the parent enzyme based on the results of screening variants for the synthesis of 5'-isobutyryluridine (compound (1)) from 5'-isobutyrylribose-1-phosphate (compound (2)) and uracil (compound (3)) (Scheme I).

Libraries of engineered genes were produced using well-established techniques (e.g., saturation mutagenesis and recombination of previously identified beneficial mutations). The polypeptides encoded by each gene were produced in HTP as described in Example 2. For all variants, the cell pellets were lysed by adding 400 μL of lysis buffer (containing 100 mM triethanolamine buffer, pH 7.5, 1 g/L lysozyme, and 0.5 g/L PMBS) and shaking at room temperature for 2 hours on a bench top shaker. The plates were centrifuged at 4,000 rpm for 15 minutes at 4°C to remove cell debris.

Reactions were performed in 96-well format in 2 mL deep-well plates, in a total volume of 100 μL. The reaction mixture contained 67 mM 5'-isobutyrylribose-1-phosphate solution, 7.5 g/L (67 mM) uracil, 91 g/L (267 mM, 4 equivalents) sucrose, 0.019 g/L (0.25 wt% with respect to uracil) SUP-101 (SEQ ID NO: 852), 50 mM triethanolamine, pH 7.5, and 100× diluted UP lysate. The reactions were set up as follows: (i) all reaction components except UP were premixed in a single solution, and 90 μL of this solution was then aliquoted into each well of a 96-well plate; (ii) 10 μL of 100× diluted UP lysate was then added to each well to start the reaction. The reaction plates were heat-sealed with a foil seal and incubated at 30°C with shaking at 600 rpm for 18-20 hours. The reactions were quenched with 400 μL of a 1:1 mixture of acetonitrile in 20 mM TEoA buffer, pH 7.5. The quenched reactions were shaken on a benchtop shaker for 5 min, then centrifuged at 4,000 rpm at 4°C for 10 min to pellet any precipitate. The supernatant (30 μL) was then transferred to a 96-well round-bottom plate prefilled with 120 μL of a 1:9 mixture of acetonitrile in 20 mM TEoA buffer, pH 7.5. The samples were analyzed according to the reverse-phase analytical method summarized in Table 2.1.

Activity relative to SEQ ID NO: 246 (activity FIOP) was calculated as the peak area of the 5'-isobutyryluridine product formed in the variant reaction divided by the peak area of the 5'-isobutyryluridine product formed in the SEQ ID NO: 246 reaction. The results are shown in Table 3.1.

Example 7

Improvements over SEQ ID NO: 594 in the synthesis of 5'-isobutyryluridine

Based on the results of screening variants for the synthesis of 5'-isobutyryluridine (compound (1)) from 5'-isobutyrylribose-1-phosphate (compound (2)) and uracil (compound (3)), SEQ ID NO: 594 was selected as the parent enzyme (Scheme I).

Libraries of engineered genes were generated using established techniques (e.g., saturation mutagenesis and recombination of previously identified beneficial mutations). The polypeptides encoded by each gene were produced in HTP as described in Example 2. For all variants, cell pellets were lysed by adding 400 μL of lysis buffer (containing 100 mM triethanolamine buffer, pH 7.5, 1 g/L lysozyme, and 0.5 g/L PMBS) and shaking on a benchtop shaker at room temperature for 2 hours. The plates were centrifuged at 4,000 rpm for 15 minutes at 4°C to remove cell debris.

Reactions were performed in 96-well format in 2 mL deep-well plates, in a total volume of 100 μL. The reaction mixture contained 67 mM 5'-isobutyrylribose-1-phosphate solution, 7.5 g/L (67 mM) uracil, 91 g/L (267 mM, 4 equivalents) sucrose, 0.019 g/L (0.25 wt% with respect to uracil) SUP-101 (SEQ ID NO: 852), 50 mM triethanolamine, pH 7.5, and 100× diluted UP lysate. The reactions were set up as follows: (i) all reaction components except UP were premixed in a single solution, and 90 μL of this solution was then aliquoted into each well of a 96-well plate; (ii) 10 μL of 100× diluted UP lysate was then added to each well to start the reaction. The reaction plates were heat-sealed with a foil seal and incubated at 30°C with shaking at 600 rpm for 18-20 hours. The reactions were quenched with 400 μL of a 1:1 mixture of acetonitrile in 20 mM TEoA buffer, pH 7.5. The quenched reactions were shaken on a benchtop shaker for 5 min, then centrifuged at 4,000 rpm at 4°C for 10 min to pellet any precipitate. The supernatant (30 μL) was then transferred to a 96-well round-bottom plate prefilled with 120 μL of a 1:9 mixture of acetonitrile in 20 mM TEoA buffer, pH 7.5. The samples were analyzed according to the reverse-phase analytical method summarized in Table 2.1.
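The mass and molar concentrations quoted for this reaction mixture can be cross-checked with a short calculation. The sketch below assumes molar masses of 112.09 g/mol for uracil and 342.30 g/mol for sucrose, which are standard values rather than figures from the patent; the function name `g_per_l_to_mm` is illustrative.

```python
# Molar masses in g/mol (standard values, assumed here; not stated in the text).
MOLAR_MASS = {"uracil": 112.09, "sucrose": 342.30}


def g_per_l_to_mm(g_per_l: float, molar_mass: float) -> float:
    """Convert a mass concentration in g/L to a molar concentration in mM."""
    return g_per_l / molar_mass * 1000.0


uracil_mm = g_per_l_to_mm(7.5, MOLAR_MASS["uracil"])    # stated as 67 mM
sucrose_mm = g_per_l_to_mm(91.0, MOLAR_MASS["sucrose"])  # stated as 267 mM
sucrose_equiv = sucrose_mm / uracil_mm                   # stated as 4 equivalents
sup101_wt_pct = 0.019 / 7.5 * 100.0                      # stated as 0.25 wt% of uracil

print(f"uracil  {uracil_mm:.1f} mM")
print(f"sucrose {sucrose_mm:.1f} mM ({sucrose_equiv:.2f} equiv)")
print(f"SUP-101 {sup101_wt_pct:.2f} wt% with respect to uracil")
```

With these molar masses the computed values agree with the stated figures to within rounding.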

Activity relative to SEQ ID NO: 594 (activity FIOP) was calculated as the peak area of the 5'-isobutyryluridine product formed in the variant reaction divided by the peak area of the 5'-isobutyryluridine product formed in the SEQ ID NO: 594 reaction. The results are shown in Table 4.1.

Example 8

Improvements over SEQ ID NO: 776 in the synthesis of 5'-isobutyryluridine

Based on the results of screening variants for the synthesis of 5'-isobutyryluridine (compound (1)) from 5'-isobutyrylribose (compound (4)) and uracil (compound (3)), SEQ ID NO: 776 was selected as the parent enzyme (Scheme III).

The uridine phosphorylase reaction is reversible. Initial experiments examined the "reverse" reaction, i.e., the phosphorolysis of compound (1) by the UP enzyme (Scheme II). Subsequent experiments studied the "forward" reaction, i.e., the production of compound (1) from compounds (2) and (3) (as shown in Scheme I). In addition, UP enzyme variants were studied in the "forward" direction in reactions in which compound (2) is generated in situ. In this Example, compound (2) is produced in the reaction by an evolved 5-methylthioribose (MTR) kinase, which catalyzes the phosphorylation of the C-1 hydroxyl group of the ribose moiety of compound (4). UP enzyme variants in the same reaction can then be assayed for their ability to convert the compound (2) produced by the MTR kinase into compound (1). Scheme III above summarizes the entire cascade. The cascade described below also includes auxiliary enzymes that support the kinase reaction: (1) pyruvate oxidase (with TPP, FAD, and pyruvate), which both consumes phosphate (driving the equilibrium toward product) and produces acetyl phosphate; (2) catalase, which consumes the peroxide byproduct formed by pyruvate oxidase; and (3) acetate kinase, which produces ATP (from ADP and acetyl phosphate) that is subsequently used by the evolved MTR kinase. UP enzyme variants were screened in this cascade because it is an example of a possible scaled-up process. Other similar enzyme cascades could also be used.
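To make the bookkeeping of the cascade easier to follow, the sketch below sums the five enzymatic steps into a net reaction. The per-step stoichiometries are textbook equations assumed for this illustration (protons are ignored); they are not quoted from the patent, and the names `STEPS` and `net_balance` are hypothetical.

```python
from fractions import Fraction

# Each step maps to (reactants, products) with stoichiometric coefficients.
# These equations are assumed textbook stoichiometries, not taken from the text.
STEPS = {
    "MTR kinase": ({"compound 4": 1, "ATP": 1},
                   {"compound 2": 1, "ADP": 1}),
    "uridine phosphorylase": ({"compound 2": 1, "uracil": 1},
                              {"compound 1": 1, "phosphate": 1}),
    "pyruvate oxidase": ({"pyruvate": 1, "phosphate": 1, "O2": 1},
                         {"acetyl phosphate": 1, "CO2": 1, "H2O2": 1}),
    "catalase": ({"H2O2": 1},
                 {"H2O": 1, "O2": Fraction(1, 2)}),
    "acetate kinase": ({"acetyl phosphate": 1, "ADP": 1},
                       {"acetate": 1, "ATP": 1}),
}


def net_balance(steps):
    """Sum all steps; negative values are net consumed, positive are net produced."""
    net = {}
    for reactants, products in steps.values():
        for species, n in reactants.items():
            net[species] = net.get(species, Fraction(0)) - Fraction(n)
        for species, n in products.items():
            net[species] = net.get(species, Fraction(0)) + Fraction(n)
    # Species that are fully recycled within the cascade cancel to zero.
    return {s: n for s, n in net.items() if n != 0}


net = net_balance(STEPS)
for species, n in sorted(net.items(), key=lambda item: item[1]):
    print(f"{species}: {n}")
```

Under these assumed equations, ATP, ADP, phosphate, acetyl phosphate, hydrogen peroxide, and compound (2) all cancel, which is consistent with the text's description of phosphate being consumed by pyruvate oxidase and ATP being regenerated by acetate kinase.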

Libraries of engineered genes were generated using established techniques (e.g., saturation mutagenesis and recombination of previously identified beneficial mutations). The polypeptides encoded by each gene were produced in HTP as described in Example 2. For all variants, cell pellets were lysed by adding 400 μL of lysis buffer (containing 100 mM triethanolamine buffer, pH 7.5, 1 g/L lysozyme, and 0.5 g/L PMBS) and shaking on a benchtop shaker at room temperature for 2 hours. The plates were centrifuged at 4,000 rpm for 15 minutes at 4°C to remove cell debris.

Reactions were performed in 96-well format in 2 mL deep-well plates, in a total volume of 100 μL. The reaction mixture contained 72.6 g/L (330 mM) 5'-isobutyrylribose solution (19.5 wt%), 37 g/L (330 mM) uracil, 0.76 g/L (0.5 mol%) TPP, 0.94 g/L (0.5 mol%) ATP, 0.14 g/L (0.05 mol%) FAD, 46.3 g/L (412 mM) pyruvate, 3.7 g/L (33 mM) K2HPO4, 0.5 wt% acetate kinase (ACK-101, Codexis, Inc.), 1.9 wt% evolved MTR kinase (SEQ ID NO: 1198), 0.5 wt% catalase, 0.8 wt% pyruvate oxidase (SEQ ID NO: 1200), and 10 mM MgCl2. The reactions were set up as follows: (i) an aqueous solution of pyruvate and potassium dihydrogen phosphate was cooled to 0°C, and the 5'-isobutyrylribose-1-phosphate solution was then added (in portions). The pH of the resulting solution was adjusted to 7.2 with 8 N KOH while maintaining the temperature at 0°C. To this solution were added MgCl2, TPP, FAD, and ATP, followed by AcK, MTR kinase, catalase, and pyruvate oxidase. The resulting solution (90 μL) was then aliquoted into each well of a 96-well plate. (ii) 10 μL of 100× diluted UP lysate was then added to each well to start the reaction. The reaction plates were sealed with breathable tape and incubated at 25°C and 85% humidity with shaking at 250 rpm for 18-20 h. The reactions were quenched with 1000 μL of a 1:1 mixture of acetonitrile in 20 mM TEoA buffer, pH 7.5. The quenched reactions were shaken vigorously on a benchtop shaker for 15 min, then centrifuged at 4,000 rpm at room temperature for 10 min to pellet any precipitate. The supernatant (15 μL) was then transferred to a 96-well round-bottom plate prefilled with 120 μL of a 1:9 mixture of acetonitrile in 20 mM TEoA buffer, pH 7.5. The samples were analyzed according to the reverse-phase analytical method summarized in Table 5.1 below.
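The cofactor loadings and the sample dilution in this cascade protocol follow from simple arithmetic, sketched below. The sketch assumes that the mol% values are expressed relative to the 330 mM substrate concentration and that volumes are additive; neither assumption is stated explicitly in the text, and the function names are illustrative.

```python
SUBSTRATE_MM = 330.0  # 5'-isobutyrylribose and uracil are both charged at 330 mM


def mol_percent_to_mm(mol_percent: float, substrate_mm: float = SUBSTRATE_MM) -> float:
    """Concentration in mM for a loading given in mol% of the substrate (assumed basis)."""
    return substrate_mm * mol_percent / 100.0


def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution on mixing sample with diluent, assuming additive volumes."""
    return (sample_ul + diluent_ul) / sample_ul


# Cofactor loadings from the reaction mixture, in mol%.
cofactors_mol_percent = {"TPP": 0.5, "ATP": 0.5, "FAD": 0.05}
cofactors_mm = {name: mol_percent_to_mm(p) for name, p in cofactors_mol_percent.items()}

pyruvate_equiv = 412.0 / SUBSTRATE_MM             # pyruvate relative to substrate
phosphate_mol_percent = 33.0 / SUBSTRATE_MM * 100.0

# Quench (100 uL reaction + 1000 uL) followed by transfer (15 uL + 120 uL).
overall_fold = dilution_factor(100.0, 1000.0) * dilution_factor(15.0, 120.0)

print(cofactors_mm)
print(f"pyruvate {pyruvate_equiv:.2f} equiv, phosphate {phosphate_mol_percent:.0f} mol%")
print(f"overall dilution {overall_fold:g}x")
```

On this basis ATP and phosphate are present in substoichiometric amounts relative to the substrate, which fits the description of ATP being regenerated by acetate kinase and phosphate being recycled through pyruvate oxidase.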

Activity relative to SEQ ID NO: 776 (activity FIOP) was calculated as the peak area of the 5'-isobutyryluridine product formed in the variant reaction divided by the peak area of the 5'-isobutyryluridine product formed in the SEQ ID NO: 776 reaction. The results are shown in Table 5.2.

Example 9

Improvements over SEQ ID NO: 868 in the synthesis of 5'-isobutyryluridine

Based on the results of screening variants for the synthesis of 5'-isobutyryluridine (compound (1)) from 5'-isobutyrylribose (compound (4)) and uracil (compound (3)), SEQ ID NO: 868 was selected as the parent enzyme (Scheme III).

Libraries of engineered genes were generated using established techniques (e.g., saturation mutagenesis and recombination of previously identified beneficial mutations). The polypeptides encoded by each gene were produced in HTP as described in Example 2. For all variants, cell pellets were lysed by adding 400 μL of lysis buffer (containing 100 mM triethanolamine buffer, pH 7.5, 1 g/L lysozyme, and 0.5 g/L PMBS) and shaking on a benchtop shaker at room temperature for 2 hours. The plates were centrifuged at 4,000 rpm for 15 minutes at 4°C to remove cell debris.

Reactions were performed in 96-well format in 2 mL deep-well plates, in a total volume of 100 μL. The reaction mixture contained 72.6 g/L (330 mM) 5'-isobutyrylribose solution (19.5 wt%), 37 g/L (330 mM) uracil, 0.76 g/L (0.5 mol%) TPP, 0.94 g/L (0.5 mol%) ATP, 0.14 g/L (0.05 mol%) FAD, 46.3 g/L (412 mM) pyruvate, 3.7 g/L (33 mM) K2HPO4, 0.5 wt% acetate kinase (ACK-101, Codexis, Inc.), 1.9 wt% evolved MTR kinase (SEQ ID NO: 1198), 0.5 wt% catalase, 0.8 wt% pyruvate oxidase (SEQ ID NO: 1200), and 10 mM MgCl2. The reactions were set up as follows: (i) an aqueous solution of pyruvate and potassium dihydrogen phosphate was cooled to 0°C, and the 5'-isobutyrylribose-1-phosphate solution was then added (in portions). The pH of the resulting solution was adjusted to 7.2 with 8 N KOH while maintaining the temperature at 0°C. To this solution were added MgCl2, TPP, FAD, and ATP, followed by AcK, MTR kinase, catalase, and pyruvate oxidase. The resulting solution (90 μL) was then aliquoted into each well of a 96-well plate. (ii) 10 μL of 100× diluted UP lysate was then added to each well to start the reaction. The reaction plates were sealed with breathable tape and incubated at 25°C and 85% humidity with shaking at 250 rpm for 18-20 h. The reactions were quenched with 1000 μL of a 1:1 mixture of acetonitrile in 20 mM TEoA buffer, pH 7.5. The quenched reactions were shaken vigorously on a benchtop shaker for 15 min, then centrifuged at 4,000 rpm at room temperature for 10 min to pellet any precipitate. The supernatant (15 μL) was then transferred to a 96-well round-bottom plate prefilled with 120 μL of a 1:9 mixture of acetonitrile in 20 mM TEoA buffer, pH 7.5. The samples were analyzed according to the reverse-phase analytical method summarized in Table 5.1.

Activity relative to SEQ ID NO: 868 (activity FIOP) was calculated as the peak area of the 5'-isobutyryluridine product formed in the variant reaction divided by the peak area of the 5'-isobutyryluridine product formed in the SEQ ID NO: 868 reaction. The results are shown in Table 6.1.

All publications, patents, patent applications, and other documents cited in this application are hereby incorporated by reference in their entirety for all purposes, to the same extent as if each individual publication, patent, patent application, or other document were individually indicated to be incorporated by reference for all purposes.

While various specific embodiments have been illustrated and described, it will be appreciated that various changes can be made without departing from the spirit and scope of the invention.

Claims (31)

1. An engineered uridine phosphorylase comprising a polypeptide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 246, SEQ ID No. 594, SEQ ID No. 776 and/or SEQ ID No. 868 or a functional fragment thereof, wherein the polypeptide sequence of the engineered uridine phosphorylase comprises at least one substitution or set of substitutions, and wherein the amino acid position of the polypeptide sequence is numbered with reference to SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 246, SEQ ID No. 594, SEQ ID No. 776 and/or SEQ ID No. 868.
2. The engineered uridine phosphorylase according to claim 1, wherein the polypeptide sequence has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID No. 2, and wherein the polypeptide sequence of the engineered uridine phosphorylase comprises at least one substitution or set of substitutions at one or more positions in the polypeptide sequence selected from the group consisting of: 45/51/80/188/189/241, 16, 7, 9, 14, 14/38/40/146/147/179/235/236, 14/38/86/146/147/235/236/240, 14/40, 14/40/86/147/193/236/240, 14/40/136/179/236/240, 14/40/146/235/236, 14/40/147/181/193/235, 14/40/235, 14/86/146, 14/146/147/181/240, 14/146/236/240, 14/147/193/235/236/240, 14/179/181/193/235/240, 14/235/236, 29, 31, 38/40/86/146/147/179/181, 38/40/86/147/236/240, 40, 40/43/86/146/240, 40/43/86/147/235, 40/43/146/147, 40/43/147/179/236/240, 40/43/147/179/240, 40/43/147/236/240, 40/86/146/235/236, 40/86/147/235/236/240, 40/86/179/235/240, 40/86/235/236/240, and combinations thereof, 40/146/147/240, 40/147/240, 40/235, 40/235/236/240, 40/236/240, 42/235/236, 43/86/147/181/240, 43/146/147/235/236/240, 43/146/179/240, 43/147, 43/147/179/181, 47/88, 64, 73, 80, 86, 86/136/146/147/179/181, 86/136/146/147/179/235/236, 86/147/179/181, 86/235, 86/235/236/240, 86/236/240, 86/240, 92, 97, 99, 103/249, 104, 105, 106, 110, 146/147, 146/147/235/236, 146/235/240, 146/236/240, 146/240, 147/179/181, 147/235/240/249, 157, 167, 179/181, 179/181/193/240, 181, 184, 216, 226, 228, 231, 233, 235/236, 236/240, 237, 239, 240 and 245, wherein the amino acid position of the polypeptide sequence is referenced to SEQ ID NO: 2.
3. The engineered uridine phosphorylase according to claim 1, wherein the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity with SEQ ID No. 4, and wherein the polypeptide sequence of the engineered uridine phosphorylase comprises at least one substitution or set of substitutions at one or more positions selected from the group consisting of: 3, 3/9/216, 3/9/216/236, 3/9/235/237, 3/31/47/179/181/216, 3/31/47/179/181/237, 3/31/47/179/216, 3/31/47/179/216/237, 3/31/179, 3/31/179/181, 3/31/179/181/237, 3/31/179/216, 3/31/181/216, 3/31/181/237, 3/47/179/181/216, 3/47/179/216/237, 3/47/181, 3/179/181, 3/179/181/216, 3/179/181/237, 3/179/216, 3/179/216/237, 3/179/237, 3/181/216, 3/181/216/237, 3/216/236/240, 9/216/236/237, 9/237, 13, 24, 31/47, 31/47/179/181/237, 31/47/179/216, 31/47/181, 31/47/216, 31/179/181/216, 31/181/216, 31/181/216/237, 31/181/237, 31/216, 31/216/237, 31/236/237/240, 31/237, 33, 46, 47, 47/147/181/231, 47/179/181, 47/179/181/216, 47/179/184, 47/179/216, 47/181/216, 47/181/216/237, 47/181/231, 47/216, 52, 63, 67, 83, 87/160, 92, 95, 97, 99, 100, 101, 105, 106, 108, 111, 137, 151, 152, 155, 159, 160, 170, 173, 177, 179/181/216, 179/181/216/237, 179/181/231, 179/181/241, 179/216, 179/228/231, 179/237, 181/216, 181/237, 183, 185, 188, 189, 191, 201, 216/237, 218, 222, 228, 231, 233, 235/236, 236/237, 240, 241 and 248, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 4.
4. The engineered uridine phosphorylase according to claim 1, wherein the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID No. 246, and wherein the polypeptide sequence of the engineered uridine phosphorylase comprises at least one substitution or set of substitutions at one or more positions selected from the group consisting of: 3/24/33/47/100/183/185, 3/24/33/47/100/216/228/233, 3/24/33/100/183/185/228, 3/24/33/108, 3/24/47/100, 3/24/47/100/108/111/160/185/233/241, 3/24/47/108/160/241, 3/24/47/160/189, 3/24/47/189/228/233, 3/24/47/228, 3/24/47/228/233, 3/24/95/100, 3/24/95/100/160/189/228/241, 3/24/100, 3/24/100/160/218/241, 3/24/111, 3/24/111/183/228/233/241, 3/24/111/228/233, 3/24/183/185/216, 3/24/189/233, 3/33/47/95/100/241, 3/33/47/100/108/189/216/228/233, 3/33/47/100/111/228, 3/33/47/100/111/233/241, 3/33/47/100/216, 3/33/47/108/111/233, 3/33/47/108/189/233/241, 3/33/160/233, 3/47, 3/47/95/100/108/189/233, 3/47/95/100/111/241, 3/47/95/160/189, 3/47/100/108/183/185/189/241, 3/47/100/160/185, 3/47/100/185/189/228, 3/47/108/111, 3/47/183/189/228/233, 3/47/189, 3/47/228/233, 3/95/100/160/228/233, 3/95/100/183/216/228/233, 3/95/100/183/233, 3/95/185/189/216, 3/95/189, 3/95/233, 3/160, 3/183/185/189/228/233, 3/183/189/228/233, 3/185/189, 3/189, 24/33/47, 24/33/47/228/241, 24/33/100/108/241, 24/47/95/100, 24/47/95/100/160/228/233/241, 24/47/185/216/218, 24/47/216, 24/95/183, 24/100/160/233, 24/160/183/185, 24/189/228/233, 33/47/95/100/233, 33/47/95/100/233/241, 33/47/160, 33/47/233, 33/100/183/185, 33/100/185/233, 47, 47/100/111/233, 47/100/189, 47/100/189/233, 47/108/160/228/241, 47/111, 47/160/185/189/233, 47/228/233, 95/100/183, 95/100/189, 95/100/228, 95/100/228/233, 95/100/233, 100/160/185, 100/228/233, 108, 108/183/189/233, 108/185/216/228/233, 160/233, 228 and 228/233, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID No. 246.
5. The engineered uridine phosphorylase according to claim 1, wherein the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity with SEQ ID No. 594, and wherein the polypeptide sequence of the engineered uridine phosphorylase comprises at least one substitution or set of substitutions at one or more positions selected from the group consisting of: 6, 6/9/29/40/100/121/126/179/181/189/237, 6/9/121/179/181, 6/46/52/63/97/121/126/179, 6/52/180/181, 6/63/126/179/242, 9, 9/40/46/97/100/106/135/179/181/207/231, 9/52/126/189/242, 9/97/100/106/126/180/207/231, 9/181/242, 29, 40, 46, 52/63/126, 52/179/189, 61, 63, 97, 100, 106, 121, 126/180/189, 135, 142, 179, 180, 181, 189, 201, 207, 230, 231, 236, 237 and 242, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 594.
6. The engineered uridine phosphorylase according to claim 1, wherein the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity with SEQ ID No. 776, and wherein the polypeptide sequence of the engineered uridine phosphorylase comprises at least one substitution or set of substitutions at one or more positions selected from the group consisting of: 3, 3/8/22/36/181/235/250, 6, 6/45/51/81/126/226/233, 6/45/51/144, 6/45/51/188/189, 6/45/51/189/208, 6/45/51/189/228/236, 6/45/51/208/233, 6/45/149/188/189/208/233/236, 6/51/126/144/208, 6/51/126/189/208/233/236, 6/51/126/189/231/236, 6/51/126/189/233, 6/51/188/189/208/226/228, 6/51/188/189/236, 6/51/189/208/231/233, 6/51/208/226/233, 6/51/208/231, 6/126/188/231/233, 6/144/208, 6/188/189/208/228/233, 8, 8/36/143/147/235, 8/36/181/235, 8/142/147, 8/147/181/250, 8/147/235, 9, 19, 20, 22, 22/147/181/235/250, 24, 36, 36/143/147/181, 36/143/147/235, 40, 41, 43, 45/51, 45/51/126/144/208/226/228, 45/51/144, 45/51/144/208/226/231/233, 45/51/188/189, 45/51/189/233, 45/51/208/226/231, 45/51/208/233, 45/51/226, 45/126/189, 45/126/189/208/226, 45/144/189/228, 45/144/226/231/233, 45/188/189, 45/188/189/208/228, 45/188/189/226/228, 45/188/189/231/233, 45/189/208, 46, 51, 51/126/144/208, 51/126/144/226/231/233/236, 51/126/208, 51/144/226, 51/188/189/228, 51/189, 51/189/208/226, 51/208, 51/233, 57, 58, 80/135/147, 81, 81/126/144/188/208/228, 86, 103, 126, 126/144/188/189/226, 134, 135, 141, 142, 143/147/235, 144/188/228, 146, 147, 149, 181, 188/189/233, 189, 207, 208/226/233, 208/228, 208/231/233, 226, 228, 230, 231, 232, 233, 235, 236, 240 and 250, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID No. 776.
7. The engineered uridine phosphorylase according to claim 1, wherein the polypeptide sequence of the engineered uridine phosphorylase has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID No. 868, and wherein the polypeptide sequence of the engineered uridine phosphorylase comprises at least one substitution or set of substitutions at one or more positions selected from the group consisting of: 6, 6/8/9/126/147, 6/9/24/181/189, 6/9/181/189/235, 6/9/208/233/235, 6/24, 6/24/43/46/181/189, 6/24/43/126/147/189, 6/24/46/103/181/208, 6/24/46/147/240, 6/24/103/189, 6/24/126/189, 6/24/147, 6/46/126/147/181/235/240, 6/46/147/189/240, 6/103, 6/103/147/230/233, 6/103/189/235, 6/126/181/189, 6/126/181/189/235, 6/126/233/235, 6/181/230/233/235, 9/43/46/103/189/233/240, 9/46/126/147/181, 9/46/147/233, 24/43/46/147/230/235, 24/46/126, 24/46/147/189, 24/46/208/230/233/235, 24/103/126/147, 24/103/126/147/181/189/208/233, 24/147, 24/147/189/230/233, 24/181/189/230/233/235, 24/189/230, 24/208, 43/46/126/147/189/240, 43/103/189/208/233, 103/126/189/233/235, 103/147/181, 147/181/233, 189/235 and 208/233, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 868.
8. The engineered uridine phosphorylase according to claim 1, wherein said engineered uridine phosphorylase comprises a polypeptide sequence which is at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to the sequence of at least one engineered uridine phosphorylase variant listed in table 1.2, table 2.2, table 3.1, table 4.1, table 5.2 and/or table 6.1.
9. The engineered uridine phosphorylase according to claim 1, wherein the engineered uridine phosphorylase comprises a polypeptide sequence which is at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 246, SEQ ID No. 594, SEQ ID No. 776 and/or SEQ ID No. 868.
10. The engineered uridine phosphorylase according to claim 1, wherein said engineered uridine phosphorylase comprises the variant engineered uridine phosphorylase listed in SEQ ID No. 4, SEQ ID No. 246, SEQ ID No. 594, SEQ ID No. 776 and/or SEQ ID No. 868.
11. The engineered uridine phosphorylase according to claim 1, wherein the engineered uridine phosphorylase comprises a polypeptide sequence which is at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to the sequence of at least one engineered uridine phosphorylase variant listed in the even numbered sequence in SEQ ID No. 4-1196.
12. The engineered uridine phosphorylase according to claim 1, wherein the engineered uridine phosphorylase comprises a polypeptide sequence listed in at least one of the even numbered sequences in SEQ ID NOs 4-1196.
13. The engineered uridine phosphorylase according to any of claims 1-12, wherein said engineered uridine phosphorylase comprises at least one improved property compared to a wild-type Escherichia coli uridine phosphorylase.
14. The engineered uridine phosphorylase according to claim 13, wherein the improved property comprises improved activity towards one or more substrates.
15. The engineered uridine phosphorylase according to claim 14, wherein the one or more substrates comprise 5'-isobutyrylribose-1-phosphate (compound (2)) and/or uracil (compound (3)).
16. The engineered uridine phosphorylase according to any of claims 13-15, wherein said improved property comprises improved production of 5'-isobutyryluridine (compound (1)).
17. The engineered uridine phosphorylase according to any of claims 1-16, wherein said engineered uridine phosphorylase is purified.
18. The engineered uridine phosphorylase according to any of claims 1-17, wherein the engineered uridine phosphorylase is part of a multi-enzyme system for producing nucleoside analogues.
19. A composition comprising at least one engineered uridine phosphorylase according to any of claims 1-16.
20. A polynucleotide sequence encoding at least one engineered uridine phosphorylase according to any of claims 1-18.
21. A polynucleotide sequence encoding at least one engineered uridine phosphorylase, said polynucleotide sequence comprising at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID No. 1, SEQ ID No. 3, SEQ ID No. 245, SEQ ID No. 593, SEQ ID No. 775 and/or SEQ ID No. 867, wherein the polynucleotide sequence of said engineered uridine phosphorylase comprises at least one substitution at one or more positions.
22. A polynucleotide sequence encoding at least one engineered uridine phosphorylase, or a functional fragment thereof, said polynucleotide sequence comprising at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity with SEQ ID No. 1, SEQ ID No. 3, SEQ ID No. 245, SEQ ID No. 593, SEQ ID No. 775 and/or SEQ ID No. 867.
23. The polynucleotide sequence of any one of claims 20-22, wherein the polynucleotide sequence is operably linked to a control sequence.
24. The polynucleotide sequence of any one of claims 20-23, wherein the polynucleotide sequence is codon optimized.
25. The polynucleotide sequence of any one of claims 20-24, wherein the polynucleotide sequence comprises the polynucleotide sequence set forth in the odd-numbered sequences of SEQ ID NOs 3-1195.
26. An expression vector comprising at least one polynucleotide sequence according to any one of claims 20-25.
27. A host cell comprising at least one expression vector according to claim 26.
28. A host cell comprising at least one polynucleotide sequence according to any one of claims 20-25.
29. A method of producing an engineered uridine phosphorylase in a host cell, the method comprising culturing the host cell according to claim 27 and/or 28 under suitable conditions such that at least one engineered uridine phosphorylase is produced.
30. The method of claim 29, further comprising recovering at least one engineered uridine phosphorylase from the culture and/or host cell.
31. The method of claim 29 and/or 30, further comprising the step of purifying the at least one engineered uridine phosphorylase.
CN202180084994.8A 2020-12-18 2021-12-17 Engineered uridine phosphorylase variant enzymes Pending CN116761614A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/127,431 2020-12-18
US202163148324P 2021-02-11 2021-02-11
US63/148,324 2021-02-11
PCT/US2021/064161 WO2022133289A2 (en) 2020-12-18 2021-12-17 Engineered uridine phosphorylase variant enzymes

Publications (1)

Publication Number Publication Date
CN116761614A true CN116761614A (en) 2023-09-15

Family

ID=87953838

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180084994.8A Pending CN116761614A (en) 2020-12-18 2021-12-17 Engineered uridine phosphorylase variant enzymes

Country Status (1)

Country Link
CN (1) CN116761614A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination