A Fusion Protein, Base Editing Tool, and Applications Thereof

Document No.: 3296  Publication date: 2021-09-17

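Technical Field

The present invention belongs to the technical field of gene editing, and specifically relates to a fusion protein, a base editing tool, and applications thereof.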

Background Art

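The CRISPR/Cas9 system is a natural defense system that bacteria use against phage DNA injection and plasmid transfer. Since its discovery it has been widely developed and exploited, giving rise to DNA editing systems and platforms that rely on targeting by a guide RNA (gRNA) and are mainly used for targeted genome editing, transcriptional regulation, and epigenome editing. The Cas9 system works as follows: the tracrRNA portion of the gRNA recruits the Cas9 protein, and binding of the gRNA converts Cas9 from an inactive conformation into one capable of recognizing DNA. In the classical CRISPR/Cas9 system, the first 20 bases of the crRNA confer target-sequence specificity on Cas9. The gRNA-Cas9 complex scans the DNA for the protospacer adjacent motif recognized by the Cas9 protein (PAM; specific bases on the target genome; the PAM of classical SpCas9 is NGG). Once a PAM site has been recognized, Cas9 locally unwinds the DNA, and the gRNA invades and pairs with the DNA to form an RNA-DNA hybrid. When the gRNA is fully paired with the target DNA, the HNH domain of Cas9 adopts a stable, active conformation and cleaves the target strand. At the same time, a larger conformational change is triggered that brings the non-target strand into the RuvC domain, where it is cleaved [1]. D10 in the RuvC domain and H840 in the HNH domain are essential for the cleavage activity of their respective domains. Introducing the D10A or H840A mutation turns Cas9 into a Cas9 nickase (Cas9n) that cuts only one strand; introducing both mutations yields dCas9, which retains targeted DNA binding but has no endonuclease activity.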

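A series of genome and epigenome editing tools has been developed on the basis of Cas9n and dCas9. The basic strategy is to fuse a catalytic enzyme or epigenetic factor with a specific function to a terminus of Cas9n or dCas9 and to use the targeting activity of Cas9n or dCas9, guided by the gRNA, to deliver that factor to a specific genomic site, thereby achieving site-specific gene editing, editing of epigenetic marks, or transcriptional activation or repression. The most classical of these site-specific tools is the base editor, in which a DNA deaminase is fused to the N-terminus of Cas9n. Guided by the gRNA, Cas9n carries the deaminase to the target DNA sequence, where it deaminates specific nucleotides; the nicking activity of Cas9n (D10A) then introduces a single-strand nick in the strand complementary to the deaminated strand, and base repair together with DNA replication completes the precise base substitution. The first class, the cytidine base editor (CBE), was first reported by David Liu's laboratory at Harvard University, which fused the rat APOBEC1 cytosine deaminase to dCas9 to obtain the first cytosine base editor. To raise the editing efficiency, they additionally fused the uracil DNA glycosylase inhibitor protein UGI to the editor, which inhibits the cell from converting the uracil back to cytosine. To make the cell preferentially use the deaminated strand as the DNA repair template, the Liu laboratory further replaced dCas9 with a Cas9n that nicks only the strand complementary to the deaminated strand. This greatly increased the editing efficiency of the CBE, which can efficiently replace a C/G base pair with T/A (C/G-to-T/A) [2]. Subsequently, the Liu laboratory invented the adenine base editor (ABE), which converts A/T to G/C at the target site (A/T-to-G/C) [3]. For this editor, the RNA adenine deaminase TadA was subjected to directed evolution to obtain TadA*, which can deaminate adenine in DNA; fusing a TadA*/TadA dimer to Cas9n gave ABE7.10, which has high adenine editing activity [4].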

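Since then, many laboratories have modified and optimized base editors, including by combining and optimizing different deaminases and Cas9 proteins. The resulting editors differ in type and properties, and together they have greatly improved editing efficiency and editing scope. The most important of these is ancBE4max, the fourth-generation base editor from David Liu's laboratory. By replacing rat APOBEC1 with ancAPOBEC1, fusing two copies of UGI, lengthening the APOBEC1-Cas9n and Cas9n-UGI linkers, and optimizing the nuclear localization signal (NLS) sequences, it greatly improved product purity and editing efficiency [5]. ancBE4max recognizes an NGG PAM, and its editing window spans positions 4-8 counted from the 5' end of the gRNA target sequence. Its Cas9n is derived from Streptococcus pyogenes (SpCas9; 1369 amino acids in total). However, the editing window and PAM restriction of ancBE4max (it mainly recognizes NGG PAMs) severely limit the portion of the genome that can be targeted.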

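Scientists have therefore combined SpCas9 mutants obtained by protein engineering and directed evolution with deaminases, producing a series of base editors with diverse targeting properties and PAM preferences. These include xCas9 [6] and SpCas9-NG [7], which recognize NGN, and SpRY [8], a Cas9 variant that is almost free of PAM restriction. Researchers have also combined Cas9 homologs from other species with deaminases, for example Nme2Cas9 [9], SaCas9 [10], St1Cas9 [11], and xCas9 [12], to obtain new editors with different editing properties, target-sequence lengths, and recognition windows.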

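The editing window of classical SpCas9-based editors is mostly positions 4-8, and every type of editor shows PAM preferences or low targeting efficiency at some sites. In addition, the expression plasmids of classical base editors far exceed the packaging capacity of adeno-associated virus (AAV), which hampers clinical research and application. Developing new base editors with different editing windows, different PAM recognition, and smaller expression plasmids is therefore key to current applied research and clinical use of gene editing.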

References:

1. Jiang, F. and J.A. Doudna, CRISPR-Cas9 Structures and Mechanisms. Annu Rev Biophys, 2017. 46: p. 505-529.

2. Komor, A.C., et al., Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 2016. 533(7603): p. 420-424.

3. Gaudelli, N.M., et al., Programmable base editing of A·T to G·C in genomic DNA without DNA cleavage. Nature, 2017. 551(7681): p. 464-471.

4. Gaudelli, N.M., et al., Programmable base editing of A·T to G·C in genomic DNA without DNA cleavage. Nature, 2017. 551(7681): p. 464-471.

5. Koblan, L.W., et al., Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. Nature Biotechnology, 2018. 36.

6. Hu, J.H., et al., Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature, 2018.

7. Engineered CRISPR-Cas9 nuclease with expanded targeting space. Science (New York, N.Y.), 2018. 361(6408): p. 1259.

8. Walton, R.T., et al., Unconstrained genome targeting with near-PAMless engineered CRISPR-Cas9 variants. Science. 368.

9. Edraki, A., et al., A Compact, High-Accuracy Cas9 with a Dinucleotide PAM for In Vivo Genome Editing. Mol Cell, 2019. 73(4): p. 714-726.e4.

10. Nishimasu, H., et al., Crystal Structure of Staphylococcus aureus Cas9. Cell, 2015. 162(5): p. 1113-26.

11. Zhang, Y., et al., Catalytic-state structure and engineering of Streptococcus thermophilus Cas9. Nature Catalysis, 2020. 3(10): p. 813-823.

12. Hu, J.H., et al., Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature, 2018. 556(7699): p. 57-63.

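Summary of the Invention

The object of the present invention is to overcome the shortcomings of the prior art by providing a new cytosine base editor that recognizes the PAM sequence NHAAAA and has a different editing window, thereby broadening the targeting range of single-base editors.

The technical solution adopted by the present invention is as follows: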

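In a first aspect, the present invention provides a fusion protein comprising an SsiCas9n polypeptide, wherein the amino acid sequence of the SsiCas9n polypeptide is:

(a) the amino acid sequence of positions 2 to 1122 of the SsiCas9 D9A nickase; or

(b) the amino acid sequence shown in SEQ ID NO.1; or

(c) an amino acid sequence having at least 90% sequence identity to the amino acid sequence shown in SEQ ID NO.1 and having the function of the amino acid sequence defined in (a).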
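
The 90% sequence identity threshold in item (c) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external tool (gaps written as "-") and uses one common definition of identity, matches divided by aligned columns; the text does not specify which definition or alignment method is intended, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a column counts as a
    match only if both residues are identical and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True if the identity reaches the threshold (90% in item (c))."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# First 20 residues of SEQ ID NO.1 in one-letter code, used only as a toy input.
reference = "NGKILGLAIGVASVGVGILD"
one_change = "NGKILGLAIGVASVGVGILE"     # 19/20 identical
three_changes = "NGKILGLAIGVASVGVGAAA"  # 17/20 identical

print(percent_identity(reference, one_change))
print(meets_identity_threshold(reference, three_changes))
```

On a full-length comparison the same calculation would be run over the alignment of a candidate sequence against all 1121 residues of SEQ ID NO.1.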

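In some preferred embodiments of the present invention, the SsiCas9n polypeptide is capable of recognizing NHAAAA as its PAM, where N denotes any base and H denotes A, C, or T.

In some preferred embodiments of the present invention, the SsiCas9n polypeptide is capable of acting as a Cas9 nickase that cleaves a single DNA strand, namely the strand complementary to the target sequence.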

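In some embodiments of the present invention, the fusion protein further comprises a deaminase ancAPOBEC1 polypeptide, wherein the amino acid sequence of the ancAPOBEC1 polypeptide is:

(d) the amino acid sequence shown in SEQ ID NO.3; or

(e) an amino acid sequence having at least 90% sequence identity to the amino acid sequence shown in SEQ ID NO.3 and having the function defined in (d), preferably cytosine deaminase function.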

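In some embodiments of the present invention, the fusion protein further comprises a uracil glycosylase inhibitor (UGI), wherein the amino acid sequence of the inhibitor is:

(f) the amino acid sequence shown in SEQ ID NO.4; or

(g) an amino acid sequence having at least 90% sequence identity to the amino acid sequence shown in SEQ ID NO.4 and having the function defined in (f), preferably uracil DNA glycosylase inhibitor function.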

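In some embodiments of the present invention, the fusion protein further comprises a nuclear localization signal peptide. Preferably, the nuclear localization signal polypeptide fragment is located at the N-terminus and/or C-terminus of the fusion protein, and its amino acid sequence is shown in SEQ ID NO.9.

In some embodiments of the present invention, the fusion protein comprises a nuclear localization signal polypeptide, the deaminase described above, a first linker, the SsiCas9n polypeptide described above, a second linker, the inhibitor described above, and a nuclear localization signal polypeptide.

In some embodiments of the present invention, the fusion protein comprises, in order from the N-terminus to the C-terminus, BPNLS, the ancAPOBEC1 polypeptide fragment, the first linker, the polypeptide fragment consisting of amino acids 2 to 1122 of the SsiCas9 D9A nickase, the second linker, the 2*UGI polypeptide, and a BPNLS polypeptide sequence.

In some embodiments of the present invention, the first linker is preferably a 32 aa linker and the second linker is preferably a 10 aa linker.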
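
The N-to-C arrangement described above can be checked against the sequence listing with a short sketch. The segment lengths below are counted from SEQ ID NO.1, 3, 4 and 5 as printed in the listing; the 7-residue NLS length is read off the two termini of SEQ ID NO.5, since SEQ ID NO.9 itself is not reproduced here. The names `SEGMENTS`, `layout` and `total_length` are hypothetical.

```python
from typing import List, Tuple

# (segment name, length in amino acids), in N-terminal to C-terminal order.
SEGMENTS: List[Tuple[str, int]] = [
    ("N-terminal NLS", 7),              # read from the start of SEQ ID NO.5
    ("ancAPOBEC1 (SEQ ID NO.3)", 228),
    ("first linker", 32),
    ("SsiCas9n (SEQ ID NO.1)", 1121),
    ("second linker", 10),
    ("2*UGI (SEQ ID NO.4)", 190),
    ("C-terminal NLS", 7),              # read from the end of SEQ ID NO.5
]


def layout(segments: List[Tuple[str, int]]) -> List[Tuple[str, int, int]]:
    """Return (name, start, end) with 1-based inclusive coordinates."""
    result = []
    position = 1
    for name, length in segments:
        result.append((name, position, position + length - 1))
        position += length
    return result


def total_length(segments: List[Tuple[str, int]]) -> int:
    """Total length of the fusion protein in amino acids."""
    return sum(length for _, length in segments)


for name, start, end in layout(SEGMENTS):
    print(f"{start:>5}-{end:<5} {name}")
print("total:", total_length(SEGMENTS), "aa;",
      3 * total_length(SEGMENTS), "nt of coding sequence")
```

The total agrees with the declared length of SEQ ID NO.5 (1595 aa), and three times that value agrees with the declared length of SEQ ID NO.6 (4785 nt).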

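In some preferred embodiments of the present invention, the amino acid sequence of the fusion protein is:

(h) the amino acid sequence shown in SEQ ID NO.5; or

(i) an amino acid sequence having at least 80% sequence similarity to SEQ ID NO.5 and having the function of the amino acid sequence defined in (h), preferably cytosine deaminase function, more preferably cytosine base editor function, and more preferably the ability to recognize NHAAAA as its PAM.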

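The present invention further provides a nucleic acid molecule encoding the SsiCas9 D9A nickase of the first aspect of the present invention, wherein the sequence of the nucleic acid molecule is:

(j) the sequence shown in SEQ ID NO.2, which is a DNA coding sequence codon-optimized for expression in eukaryotes; or

(k) a DNA coding sequence corresponding to an amino acid sequence having at least 90% sequence identity to the amino acid sequence shown in SEQ ID NO.1, and having the function defined in (a) or (j); or

(l) a DNA sequence that differs from the DNA sequence shown in SEQ ID NO.2 only by synonymous codons.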
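
Item (l) covers DNA sequences that differ only by synonymous codons, that is, sequences that translate to the same protein. A minimal sketch of that check follows, assuming the standard genetic code; the first test input is the first 24 nt of SEQ ID NO.2, while the alternative coding sequences are made up for illustration and the function names are hypothetical.

```python
# Standard genetic code, built in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_synonymous(dna_a: str, dna_b: str) -> bool:
    """True if both coding sequences encode the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


seq2_start = "aacggcaagatcctgggactggcc"   # first 8 codons of SEQ ID NO.2
alternative = "aatggaaaaattttaggtttagct"  # hypothetical synonymous recoding
mutated = "aacggcaagatcctgggactggac"      # hypothetical: last codon now Asp

print(translate(seq2_start))
print(is_synonymous(seq2_start, alternative))
print(is_synonymous(seq2_start, mutated))
```

The translated start of SEQ ID NO.2 matches the first eight residues of SEQ ID NO.1 (Asn Gly Lys Ile Leu Gly Leu Ala).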

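In a second aspect, the present invention provides a gene encoding the fusion protein of the first aspect of the present invention.

In some embodiments of the present invention, the sequence of the gene is:

(m) the sequence shown in SEQ ID NO.6; or

(n) a DNA coding sequence corresponding to an amino acid sequence having at least 90% sequence identity to the amino acid sequence shown in SEQ ID NO.5, and having the function defined in (h) or (m); or

(o) a DNA sequence that differs from the DNA sequence shown in SEQ ID NO.6 only by synonymous codons.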

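In a third aspect, the present invention provides a composition comprising a gRNA and the fusion protein of the first aspect of the present invention,

wherein the gRNA is a chimeric, non-naturally occurring guide polynucleotide;

and the gRNA/Cas complex can fully or partially recognize and bind the target sequence and nick, unwind, or cleave the target sequence.

In some preferred embodiments of the present invention, the gRNA expression element consists, in order, of a U6 promoter, a restriction site for inserting the gRNA targeting sequence, a scaffold (Ssi-specific), and a termination signal.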

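In some embodiments of the present invention, the scaffold is designed from the tandem repeat sequence of Streptococcus sinensis, and its sequence is:

(p) the DNA sequence shown in SEQ ID NO.8; or

(q) a DNA sequence having at least 80% sequence similarity to SEQ ID NO.8 and having the function of the DNA sequence defined in (p).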

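In some preferred embodiments of the present invention, the sequence of the gRNA is:

(r) the DNA sequence shown in SEQ ID NO.7; or

(s) a DNA sequence having at least 80% sequence similarity to SEQ ID NO.7 and having the function of the DNA sequence defined in (r).

In some preferred embodiments of the present invention, the gRNA expression vector further comprises a coding sequence containing an EGFP tag and, more preferably, a gRNA targeting a specific site.

Here, the coding sequence of the Cas9 homolog SsiCas9 has been codon-optimized for eukaryotes; the protein recognizes NHAAAA as its PAM sequence, which differs from the PAMs of previously reported base editors; the designed gRNA is 20 nt long; and Ssi-ancBE4max can convert C to T at positions 3-12 from the 5' end of the target sequence. It can therefore target sites that previously reported cytosine base editors cannot, which expands the genome-wide targetable range of single-base editors and offers more options for their application.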
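
The targeting rules stated above (a 20 nt target immediately 5' of an NHAAAA PAM, with cytosines at positions 3-12 as candidates for editing) can be sketched as a simple sequence scan. This is an illustration under stated assumptions, not the inventors' design pipeline: it scans only the strand given, the function name `find_targets` is hypothetical, and the PAM appended to the test sequence is made up because the genomic context of the example site is not given in the text.

```python
import re

# N = any base, H = A, C or T (IUPAC). The lookahead allows overlapping hits.
TARGET_RE = re.compile(r"(?=([ACGT]{20})([ACGT][ACT]AAAA))")


def find_targets(sequence: str, window=(3, 12)):
    """List 20 nt targets followed by an NHAAAA PAM on the given strand.

    For each hit, 'editable_c' holds the 1-based positions (counted from the
    5' end of the 20 nt target) of cytosines inside the editing window.
    """
    sequence = sequence.upper()
    first, last = window
    hits = []
    for match in TARGET_RE.finditer(sequence):
        protospacer, pam = match.group(1), match.group(2)
        editable = [pos for pos in range(first, last + 1)
                    if protospacer[pos - 1] == "C"]
        hits.append({
            "start": match.start(),  # 0-based index in the input
            "protospacer": protospacer,
            "pam": pam,
            "editable_c": editable,
        })
    return hits


# 20 nt spacer of sgSsi-1 from Table 1, followed by a hypothetical PAM.
demo = "tgggcaagagtttctgccac" + "ATAAAA"
for hit in find_targets(demo):
    print(hit["protospacer"], hit["pam"], hit["editable_c"])
```

To cover both strands, the same function can be applied to the reverse complement of the input sequence.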

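In a fourth aspect, the present invention provides a recombinant vector, recombinant bacterium, or cell line comprising the gene of the second aspect of the present invention.

In some embodiments of the present invention, the cell is a eukaryotic cell or a prokaryotic cell.

In some preferred embodiments of the present invention, the cell is a mouse cell or a human cell.

In some preferred embodiments of the present invention, the cell is a human embryonic kidney cell.

In some more preferred embodiments of the present invention, the cell is an HEK293T cell.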

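In a fifth aspect, the present invention provides use of the fusion protein of the first aspect, the gene of the second aspect, the composition of the third aspect, or the recombinant vector, recombinant bacterium, or cell line of the fourth aspect of the present invention in gene editing.

In a sixth aspect, the present invention provides a gene editing method in which the fusion protein of the first aspect, the gene of the second aspect, the composition of the third aspect, or the recombinant vector, recombinant bacterium, or cell line of the fourth aspect of the present invention is used for gene editing in vivo or in vitro.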

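The beneficial effects of the present invention are as follows:

The present invention provides a fusion protein (base editor) based on a Cas9 from Streptococcus sinensis, together with a new base editing tool. Specifically, combining SsiCas9, which recognizes NHAAAA, with BE4max gives a new cytosine base editor (CBE) named SsiCas9-ancBE4max. Testing showed that this editor efficiently induces C-to-T conversion at positions 3-12 from the 5' end of the target sequence and recognizes NHAAAA as its PAM. The editing tool includes a scaffold sequence designed from the tandem repeat sequence of Streptococcus sinensis, and the designed targeting gRNA is 20 nt long. By converting cytosine to thymine (C-to-T) at these positions, the tool broadens the targeting range and the range of applications of base editing.

In addition, the base editor of the present invention has a smaller expression plasmid than existing base editing tools, which makes it better suited to the packaging capacity of adeno-associated virus (AAV). It extends the genomic targeting range of base editing, adds to the choice of tools for base editing and gene correction, and has good prospects for gene therapy and industrial application.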

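Brief Description of the Drawings

Figure 1 is a schematic diagram of the domains of the Ssi protein.

Figure 2 is a schematic diagram of the protein domains of Ssi-ancBE4max.

Figure 3 is a schematic plasmid map of Ssi-ancBE4max.

Figure 4 is a schematic plasmid map of the gRNA plasmid of the Ssi-ancBE4max system.

Figure 5 shows the experimental results of Example 3 of the present invention for Ssi-ancBE4max. Panel A shows the editing results at the Ssi2 site, panel B at the Ssi6 site, panel C at the Ssi8 site, and panel D at the Ssi10 site.

Figure 6 is a heat map summarizing the editing efficiency of the Ssi-ancBE4max editing system in HEK293T cells. The dashed box indicates the editing window.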

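Detailed Description of the Embodiments

The concept of the present invention and the technical effects it produces are described clearly and completely below with reference to the examples, so that the object, features, and effects of the present invention can be fully understood. The examples described are evidently only some, not all, of the embodiments of the present invention. Other embodiments obtained by a person skilled in the art on the basis of these examples without inventive effort all fall within the scope of protection of the present invention.

Example 1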

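The amino acid sequence of SsiCas9, a Cas9 homolog from Streptococcus sinensis, was aligned with that of SpCas9 to delineate the functional domains of SsiCas9, which are shown in Figure 1. The functional residue of the SsiCas9 RuvC domain (the aspartate at position 9, D9) was identified and mutated to alanine (A), giving the SsiCas9 D9A nickase, whose amino acid sequence is shown in SEQ ID NO.1.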
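
Because SEQ ID NO.1 covers positions 2 to 1122 of the nickase, position 9 of the full-length protein is the eighth residue of the listed sequence. A minimal sketch of that numbering conversion, using the first 16 residues of SEQ ID NO.1 in one-letter code; the helper names are hypothetical.

```python
# First 16 residues of SEQ ID NO.1 (Asn Gly Lys Ile Leu Gly Leu Ala ...).
SEQ1_N_TERMINUS = "NGKILGLAIGVASVGV"
FIRST_POSITION = 2  # SEQ ID NO.1 starts at residue 2 of the full-length protein


def residue_at(fragment: str, position: int,
               first_position: int = FIRST_POSITION) -> str:
    """Residue at a full-length position within a fragment starting at first_position."""
    index = position - first_position
    if not 0 <= index < len(fragment):
        raise IndexError(f"position {position} lies outside the fragment")
    return fragment[index]


def substitute(fragment: str, position: int, new_residue: str,
               first_position: int = FIRST_POSITION) -> str:
    """Return a copy of the fragment with one residue replaced."""
    residue_at(fragment, position, first_position)  # validates the position
    index = position - first_position
    return fragment[:index] + new_residue + fragment[index + 1:]


# The nickase carries alanine at position 9 (the D9A mutation).
print(residue_at(SEQ1_N_TERMINUS, 9))
# Reverting A9 to D gives the presumed wild-type N-terminal stretch.
print(substitute(SEQ1_N_TERMINUS, 9, "D"))
# Last full-length position covered by the 1121-residue SEQ ID NO.1.
print(FIRST_POSITION + 1121 - 1)
```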

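The prokaryotic codons of Streptococcus sinensis SsiCas9 D9A were optimized for eukaryotic expression, giving a DNA coding sequence for SsiCas9 D9A suited to expression in eukaryotic cells (SEQ ID NO.2). The optimized SsiCas9 D9A sequence was produced by whole-gene synthesis at a commercial company. The construction strategy starts from ancBE4max and replaces its SpCas9 D10A with SsiCas9 D9A; ancBE4max itself was also obtained by commercial whole-gene synthesis. Next, the segment of ancBE4max comprising part of the XTEN linker, SpCas9 D10A, the 10 aa linker, and UGI was excised with the restriction endonuclease BamHI. When SsiCas9 D9A was synthesized commercially, the excised flanking parts were added back, so that the synthesized fragment is partial XTEN linker-SsiCas9 D9A-10aa linker-UGI with a BamHI site at each end (SEQ ID NO.10).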

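The AncBE4max plasmid (pCMV backbone) was digested with the restriction endonuclease BamHI (R0136L) in a 37 °C water bath for 2 h. The 50 μl reaction contained: 10x Buffer, 5 μl; vector, 5 μg; BamHI, 3 μl; ddH2O, to 50 μl. Complete digestion was confirmed by gel electrophoresis, after which the linearized vector was purified with a clean-up kit (AxyPrep PCR Clean-up Kit) and eluted in 15 μl ddH2O. The synthesized XTEN linker-SsiCas9 D9A-10aa linker-UGI fragment was amplified by PCR, adding protective bases outside the restriction sites at both ends, with primers synthesized by GENEWIZ (金唯智生物科技有限公司). The primer sequences were:

Ssi PCR for: 5'-agcggaggatcctctggcagcgagacacca-3' (SEQ ID NO.11);

Ssi PCR rev: 5'-cctccggatcctccgctcagcatcttgatctta-3' (SEQ ID NO.12).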
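
As a quick consistency check on the two primers above, the sketch below locates the BamHI recognition site (GGATCC) in each primer and counts the bases 5' of it, which correspond to the protective bases mentioned in the text. The helper name `site_report` is hypothetical.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

PRIMERS = {
    "Ssi PCR for": "agcggaggatcctctggcagcgagacacca",     # SEQ ID NO.11
    "Ssi PCR rev": "cctccggatcctccgctcagcatcttgatctta",  # SEQ ID NO.12
}


def site_report(primer: str, site: str = BAMHI_SITE) -> dict:
    """Report whether the site occurs and how many bases lie 5' of it."""
    sequence = primer.upper()
    index = sequence.find(site)
    return {
        "has_site": index >= 0,
        # Bases upstream of the first occurrence act as protective bases.
        "bases_5prime_of_site": index if index >= 0 else None,
    }


for name, sequence in PRIMERS.items():
    print(name, site_report(sequence))
```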

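The fragment was amplified by PCR and purified with the clean-up kit (AxyPrep PCR Clean-up Kit). The purified PCR product was then digested with BamHI, using the same reaction composition as described above.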
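
The 50 μl BamHI digestion is specified by DNA mass (5 μg) and a final volume, so the water volume depends on the concentration of the plasmid preparation. A minimal sketch of that arithmetic follows; the DNA concentration used in the example (0.5 μg/μl) is an assumed value, not one given in the text, and the function name is hypothetical.

```python
def water_volume_ul(total_ul: float, buffer_ul: float, enzyme_ul: float,
                    dna_ug: float, dna_conc_ug_per_ul: float) -> float:
    """Volume of ddH2O needed to bring a digestion to its final volume."""
    if dna_conc_ug_per_ul <= 0:
        raise ValueError("DNA concentration must be positive")
    dna_ul = dna_ug / dna_conc_ug_per_ul
    water = total_ul - buffer_ul - enzyme_ul - dna_ul
    if water < 0:
        raise ValueError("components exceed the final volume; "
                         "use a more concentrated DNA preparation")
    return water


# 50 ul reaction: 5 ul 10x buffer, 3 ul BamHI, 5 ug vector.
# The 0.5 ug/ul plasmid concentration is an assumption for illustration.
print(water_volume_ul(total_ul=50, buffer_ul=5, enzyme_ul=3,
                      dna_ug=5, dna_conc_ug_per_ul=0.5))
```

With a dilute preparation the DNA alone can exceed the available volume, in which case the function raises an error instead of returning a negative water volume.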

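The purified XTEN linker-SsiCas9 D9A-10aa linker-UGI fragment was ligated into the BamHI-linearized pCMV_ancBE4max vector to give the initial ligation product. The 10 μl ligation reaction contained: purified linearized pCMV_ancBE4max vector, 1 μl (50 ng); BamHI-digested XTEN linker-SsiCas9 D9A-10aa linker-UGI, 1 μl (100 ng); T4 DNA Ligase Buffer, 1 μl; T4 DNA Ligase, 1 μl; ddH2O, 6 μl. Ligation was carried out at 16 °C for 2 h. The ligation product was transformed and plated, and single colonies were picked, cultured, and verified by sequencing. The protein and DNA sequences of the resulting SsiCas9-ancBE4max are shown in SEQ ID NO.5 and SEQ ID NO.6, respectively. From the N-terminus to the C-terminus, the protein is a fusion of BPNLS, the ancAPOBEC1 polypeptide fragment, the 32 aa linker, the polypeptide fragment consisting of amino acids 2 to 1122 of the SsiCas9 D9A nickase, the 10 aa linker, the 2*UGI polypeptide, and a BPNLS polypeptide sequence. The amino acid sequence of the BPNLS nuclear localization signal fragment is shown in SEQ ID NO.9, that of the ancAPOBEC1 polypeptide in SEQ ID NO.3, and that of the UGI polypeptide in SEQ ID NO.4.

A domain diagram of the successfully constructed fusion is shown in Figure 2, and the plasmid map is shown in Figure 3.

Verified positive single clones were expanded in liquid culture, and plasmid was extracted according to the kit protocol (TIANGEN: TIANpure Midi Plasmid Kit) and quantified, to ensure that enough plasmid was available for transfection and that it was free of salt and protein contamination.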

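Example 2

2.1 Construction of the gRNA plasmid vector for the SsiCas9-ancBE4max system

A gRNA expression vector suitable for the SsiCas9 gRNA editing system was built on the pGL3-U6-sgRNA (Addgene #51133) backbone. A scaffold sequence suitable for the SsiCas9 gRNA system was designed from the tandem repeat sequence of Streptococcus sinensis, and the scaffold of pGL3-U6-sgRNA (Addgene #51133), which is suited to SpCas9, was replaced with this SsiCas9 gRNA scaffold. The complete plasmid is shown in SEQ ID NO.7 and was named pGL3-U6-Ssi gRNA; its plasmid map is shown in Figure 4. The targeting gRNA sequence is inserted between two BsaI restriction sites. The plasmid was produced by whole-gene synthesis at a commercial company.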

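2.2 Construction of targeting gRNA plasmids for the SsiCas9-ancBE4max system

gRNAs were designed, and two complementary oligos were synthesized for each. The forward oligo has the form 5'-accg-20nt-3' and the reverse oligo the form 5'-aaac-20nt-3', where the 20 nt of the reverse oligo is the reverse complement of the 20 nt of the forward oligo. The forward 20 nt is taken from a genomic site of the form 20nt-NHAAAA on the DNA strand that carries the PAM. The synthesized oligo pairs were annealed with the following program: 95 °C for 5 min; 95 °C to 85 °C at -2 °C/s; 85 °C to 25 °C at -0.1 °C/s; hold at 4 °C. They were then ligated into the pGL3-U6-Ssi gRNA vector linearized with BsaI (NEB: R0539L).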
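
The oligo design rule above (forward oligo = accg + 20 nt spacer, reverse oligo = aaac + reverse complement of the spacer) can be written as a short sketch. The function names are hypothetical; the example input is the 20 nt spacer of sgSsi-1 from Table 1, and the output reproduces both oligos listed there.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Reverse complement of a DNA sequence, preserving letter case."""
    return sequence.translate(COMPLEMENT)[::-1]


def design_oligos(spacer: str):
    """Return (forward, reverse) cloning oligos for a 20 nt spacer.

    Overhangs are written in upper case and the spacer in lower case,
    following the notation used in Table 1.
    """
    spacer = spacer.lower()
    if len(spacer) != 20 or set(spacer) - set("acgt"):
        raise ValueError("spacer must be 20 nt of A/C/G/T")
    forward = "ACCG" + spacer
    reverse = "AAAC" + reverse_complement(spacer)
    return forward, reverse


forward, reverse = design_oligos("tgggcaagagtttctgccac")  # sgSsi-1 spacer
print("for 5'-" + forward + "-3'")
print("rev 5'-" + reverse + "-3'")
```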

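The linearization digest contained: pGL3-U6-Ssi gRNA, 2 μg; buffer (NEB: R0539L), 6 μL; BsaI, 2 μL; ddH2O to 60 μL. Digestion was carried out overnight at 37 °C. The ligation reaction contained: T4 ligation buffer (NEB: M0202L), 1 μL; linearized vector, 20 ng; annealed oligo fragment (10 μM), 5 μL; T4 DNA ligase (NEB: M0202L), 0.5 μL; ddH2O to 10 μL. Ligation was carried out overnight at 16 °C. The ligated vector was transformed, and colonies were picked and verified. Plasmid was extracted from expanded positive clones (Axygen: AP-MN-P-250G) and quantified.

Human endogenous genes including EMX1, RUNX1, DNMT1, AARSD1, GMPR2, ABCD3, and NFYB were selected; a total of 10 gRNAs were designed and 20 oligos synthesized. The sequences are listed in Table 1.

Table 1. Oligo sequences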

sgSsi-1 for 5’-ACCGtgggcaagagtttctgccac-3’(SEQ ID NO.13)
sgSsi-1 rev 5’-AAACgtggcagaaactcttgccca-3’(SEQ ID NO.14)
sgSsi-2 for 5’-ACCGctgcgttcctagaaccacag-3’(SEQ ID NO.15)
sgSsi-2 rev 5’-AAACctgtggttctaggaacgcag-3’(SEQ ID NO.16)
sgSsi-3 for 5’-ACCGaatgctggctacagatgtcc-3’(SEQ ID NO.17)
sgSsi-3 rev 5’-AAACggacatctgtagccagcatt-3’(SEQ ID NO.18)
sgSsi-4 for 5’-ACCGctcatatgtcacttacctct-3’(SEQ ID NO.19)
sgSsi-4 rev 5’-AAACagaggtaagtgacatatgag-3’(SEQ ID NO.20)
sgSsi-5 for 5’-ACCGgagacaggatctcactgtgt-3’(SEQ ID NO.21)
sgSsi-5 rev 5’-AAACacacagtgagatcctgtctc-3’(SEQ ID NO.22)
sgSsi-6 for 5’-ACCGtgctctaggtggtgttaatg-3’(SEQ ID NO.23)
sgSsi-6 rev 5’-AAACcattaacaccacctagagca-3’(SEQ ID NO.24)
sgSsi-7 for 5’-ACCGcagcaacatgaacaactgaa-3’(SEQ ID NO.25)
sgSsi-7 rev 5’-AAACttcagttgttcatgttgctg-3’(SEQ ID NO.26)
sgSsi-8 for 5’-ACCGaagagccaagtcttactgta-3’(SEQ ID NO.27)
sgSsi-8 rev 5’-AAACtacagtaagacttggctctt-3’(SEQ ID NO.28)
sgSsi-9 for 5’-ACCGctgacaagtactagcttatg-3’(SEQ ID NO.29)
sgSsi-9 rev 5’-AAACcataagctagtacttgtcag-3’(SEQ ID NO.30)
sgSsi-10 for 5’-ACCGttcctcatagcaacatcact-3’(SEQ ID NO.31)
sgSsi-10 rev 5’-AAACagtgatgttgctatgaggaa-3’(SEQ ID NO.32)

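Example 3

HEK293T cells were transfected with the base editing system consisting of the SsiCas9-ancBE4max plasmid and the pGL3-U6-Ssi gRNA plasmid constructed in the examples above, as follows:

3.1 HEK293T cells (from ATCC) were thawed and cultured in 10 cm dishes (Corning, 430167) in DMEM (HyClone, SH30243.01) supplemented with 10% fetal bovine serum (HyClone, SV30087) at 37 °C and 5% CO2. After several passages, cells at 90% confluence were split into 24-well plates.

3.2 Three passages after thawing, the cells were inspected, and cells in good condition were seeded into 24-well plates. After 18-24 h, when the cells had reached 80% confluence, they were transfected. Amounts per well: SsiCas9-ancBE4max plasmid, 1 μg; pGL3-U6-Ssi gRNA plasmid, 0.5 μg; EZ Trans transfection reagent (李记生物), 4.5 μl.

3.3 The transfection steps (following the protocol for the high-efficiency EZ Trans transfection reagent from Shanghai 李记生物) were: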

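3.3.1 Prepare reagent A: for each well, dilute 1.5 μg plasmid DNA (1 μg SsiCas9-ancBE4max plasmid + 0.5 μg pGL3-U6-Ssi gRNA plasmid) in 50 μl high-glucose DMEM without serum or antibiotics (or Opti-MEM) and mix.

3.3.2 Prepare reagent B: for each well, dilute 4.5 μl EZ Trans transfection reagent (EZ Trans : plasmid DNA = 3:1, μl:μg) in 50 μl high-glucose DMEM without serum or antibiotics (or Opti-MEM I) and mix gently. Serum-containing medium must not be used to dilute the plasmid or the EZ Trans reagent at this step, because serum contains large amounts of negatively charged proteins that may interfere with binding of the nucleic acid by the transfection reagent and so reduce transfection efficiency.

3.3.3 Let reagents A and B stand for 5 min, then promptly add all of reagent B to reagent A and mix gently. The order of mixing must not be reversed.

3.3.4 Let the mixture stand at room temperature for 15 min to form the EZ Trans-DNA complex. Add all of the prepared complex dropwise and evenly to the well containing the cells, then gently rock or lightly shake the plate to disperse the complex evenly.

3.3.5 Incubate at 37 °C and 5% CO2 for 4-6 h, remove the medium containing the EZ Trans-DNA complex, replace it with fresh medium, and culture for 3 days.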
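
When several wells receive the same plasmids, the per-well amounts in steps 3.3.1 and 3.3.2 are usually scaled into a master mix. A minimal sketch of that scaling follows, using only the per-well quantities stated above; the optional pipetting overage and the function name `master_mix` are assumptions, not part of the described protocol.

```python
# Per-well amounts from steps 3.3.1 (reagent A) and 3.3.2 (reagent B).
PER_WELL = {
    "editor_plasmid_ug": 1.0,  # SsiCas9-ancBE4max plasmid
    "grna_plasmid_ug": 0.5,    # pGL3-U6-Ssi gRNA plasmid
    "medium_a_ul": 50.0,       # serum-free medium for reagent A
    "reagent_ul": 4.5,         # EZ Trans transfection reagent
    "medium_b_ul": 50.0,       # serum-free medium for reagent B
}


def master_mix(n_wells: int, overage: float = 0.0) -> dict:
    """Scale the per-well amounts to n_wells, with optional fractional overage.

    overage=0.1 prepares 10% extra to cover pipetting losses (an assumption).
    """
    if n_wells <= 0:
        raise ValueError("n_wells must be positive")
    if overage < 0:
        raise ValueError("overage must not be negative")
    factor = n_wells * (1.0 + overage)
    return {name: round(amount * factor, 2) for name, amount in PER_WELL.items()}


for name, amount in master_mix(10, overage=0.1).items():
    print(f"{name}: {amount}")
```

Wells that receive different gRNA plasmids still need the gRNA plasmid added separately; only the shared components can be pooled.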

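3.4 Three days after transfection, the cells were detached with trypsin and collected, and GFP-positive cells (top 15% by FITC fluorescence intensity) were isolated by flow sorting. Genomic DNA was extracted from the sorted cells by the phenol-chloroform method.

3.5 PCR primers were designed and synthesized 100-130 bp upstream and downstream of each selected endogenous target site and diluted to 10 μM with water. The genomic fragment around each target site was amplified by PCR with a Vazyme high-fidelity polymerase kit (Vazyme, p501-d2). The PCR products were gel-purified with the AxyPrep DNA Gel Extraction Kit (Axygen, AP-GX-250G) to remove non-specific bands. The PCR primer sequences are listed in Table 2.

Table 2. PCR primer sequences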

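3.6 Successful amplification of the target fragments was first checked by gel electrophoresis. Successfully amplified fragments were Sanger sequenced, and the sequencing results were inspected for the expected point mutations at the target site (C-to-T or G-to-A).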
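
The comparison in step 3.6 can be sketched as a position-by-position check of a reference sequence against an observed base call, classifying each difference as C-to-T, G-to-A (the same edit read on the opposite strand), or other. This is a simplified illustration: real Sanger traces from edited cell pools show mixed peaks rather than clean base calls, the function name is hypothetical, and the observed sequences below are invented inputs, not results from the document.

```python
def call_base_edits(reference: str, observed: str):
    """List substitutions between two equal-length sequences.

    Returns tuples (position, reference_base, observed_base, kind), with
    1-based positions and kind in {"C-to-T", "G-to-A", "other"}.
    """
    if len(reference) != len(observed):
        raise ValueError("sequences must have equal length (no indels handled)")
    edits = []
    pairs = zip(reference.upper(), observed.upper())
    for position, (ref_base, obs_base) in enumerate(pairs, start=1):
        if ref_base == obs_base:
            continue
        if (ref_base, obs_base) == ("C", "T"):
            kind = "C-to-T"
        elif (ref_base, obs_base) == ("G", "A"):
            kind = "G-to-A"
        else:
            kind = "other"
        edits.append((position, ref_base, obs_base, kind))
    return edits


reference = "TGGGCAAGAGTTTCTGCCAC"      # sgSsi-1 target, protospacer strand
observed = "TGGGTAAGAGTTTCTGCCAC"       # invented read with C5 converted to T
print(call_base_edits(reference, observed))
print(call_base_edits("CTGTGG", "CTATGG"))  # invented opposite-strand example
```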

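The sequencing results are shown in Figure 5 (panel A: Ssi2 site; panel B: Ssi6 site; panel C: Ssi8 site; panel D: Ssi10 site). In each panel, the first column on the left shows the target DNA sequence and the second column the PAM sequence; the chart on the right gives the C-to-T editing efficiency at each position within the gRNA target sequence. Figure 5 thus presents results for four sites, Ssi2, Ssi6, Ssi8, and Ssi10, and shows that the gene editing tool SsiCas9-ancBE4max obtained in Example 1 induces efficient C-to-T conversion. In HEK293T cells, a total of 10 endogenous human genomic sites were tested (Figure 6). SsiCas9-ancBE4max induced efficient C-to-T conversion at all of them, with editing mainly at positions 3-12 of the gRNA target sequence, which broadens the targeting range of base editors.

The present invention has been described in detail through the embodiments above, but it is not limited to these examples. Within the knowledge of a person of ordinary skill in the art, various changes may be made without departing from the spirit of the present invention. In addition, where no conflict arises, the examples of the present invention and the features in them may be combined with one another.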

SEQUENCE LISTING

<110> Guangzhou University

<120> A Fusion Protein, Base Editing Tool, and Applications Thereof

<130>

<160> 52

<170> PatentIn version 3.5

<210> 1

<211> 1121

<212> PRT

<213> Artificial Sequence

<400> 1

Asn Gly Lys Ile Leu Gly Leu Ala Ile Gly Val Ala Ser Val Gly Val

1 5 10 15

Gly Ile Leu Asp Lys Lys Thr Gly Glu Ile Ile His Ala Ser Ser Arg

20 25 30

Ile Phe Pro Ala Ala Thr Ala Asp Ser Asn Val Glu Arg Arg Gly Phe

35 40 45

Arg Gln Gly Arg Arg Leu Gly Arg Arg Lys Lys His Arg Lys Val Arg

50 55 60

Leu Ala Asp Leu Phe Ser Asp Thr Gly Leu Ile Thr Asp Phe Ser Lys

65 70 75 80

Val Ser Ile Asn Leu Asn Pro Tyr Glu Leu Arg Ile Lys Gly Leu Asn

85 90 95

Glu Lys Leu Thr Asn Glu Glu Leu Phe Ile Ala Leu Lys Asn Ile Val

100 105 110

Lys Arg Arg Gly Ile Ser Tyr Leu Asp Asp Ala Asn Glu Asp Gly Glu

115 120 125

Ser Ser Ser Ser Glu Tyr Gly Lys Ala Val Glu Glu Asn Arg Lys Leu

130 135 140

Leu Ala Asp Lys Thr Pro Gly Gln Ile Gln Leu Glu Arg Phe Glu Lys

145 150 155 160

Tyr Gly Gln Val Arg Gly Asp Phe Thr Ile Glu Glu Asn Gly Glu Lys

165 170 175

His Arg Leu Leu Asn Val Phe Ser Thr Ser Ala Tyr Lys Lys Glu Ala

180 185 190

Glu Arg Ile Leu Thr Lys Gln Gln Asp Tyr Asn Gln Asp Ile Thr Asp

195 200 205

Glu Phe Ile Gln Ala Tyr Leu Thr Ile Leu Thr Gly Lys Arg Lys Tyr

210 215 220

Tyr His Gly Pro Gly Asn Glu Lys Ser Arg Thr Asp Tyr Gly Arg Phe

225 230 235 240

Arg Thr Asp Gly Thr Thr Leu Asp Asn Ile Phe Gly Ile Leu Ile Gly

245 250 255

Lys Cys Thr Phe Tyr Pro Glu Glu Tyr Arg Ala Ala Lys Ala Ser Tyr

260 265 270

Thr Ala Gln Glu Phe Asn Leu Leu Asn Asp Leu Asn Asn Leu Thr Val

275 280 285

Pro Thr Glu Thr Lys Lys Leu Ser Glu Glu Gln Lys Arg Gln Ile Ile

290 295 300

Glu Tyr Ala Lys Gly Ala Lys Thr Leu Gly Ala Ala Thr Leu Leu Lys

305 310 315 320

Tyr Ile Ala Lys Leu Val Asp Gly Ser Val Glu Asp Ile Lys Gly Tyr

325 330 335

Arg Ile Asp Lys Ser Glu Lys Pro Glu Met His Thr Phe Asp Ile Tyr

340 345 350

Arg Lys Met Gln Thr Leu Glu Thr Val Asp Val Glu Lys Leu Ser Arg

355 360 365

Glu Val Leu Asp Glu Leu Ala His Ile Leu Thr Leu Asn Thr Glu Arg

370 375 380

Glu Gly Ile Glu Glu Ala Ile Lys Val Ser Phe Ile Lys Arg Glu Phe

385 390 395 400

Glu Gln Asp Gln Ile Ala Glu Leu Val Ser Phe Arg Lys Ser Asn Ser

405 410 415

Ser Leu Phe Gly Lys Gly Trp His Asn Phe Ser Ile Lys Leu Met Thr

420 425 430

Glu Leu Ile Pro Glu Leu Tyr Glu Thr Ser Glu Glu Gln Met Thr Ile

435 440 445

Leu Thr Arg Leu Gly Lys Gln Lys Thr Lys Ala Arg Ser Lys Arg Thr

450 455 460

Lys Tyr Ile Asp Glu Lys Glu Leu Thr Asp Glu Ile Tyr Asn Pro Val

465 470 475 480

Val Ala Lys Ser Val Arg Gln Ala Ile Lys Ile Ile Asn Leu Ala Thr

485 490 495

Lys Lys Tyr Gly Val Phe Asp Asn Ile Val Ile Glu Met Ala Arg Glu

500 505 510

Asn Asn Glu Glu Asp Ala Lys Lys Asp Tyr Val Lys Arg Gln Lys Ala

515 520 525

Asn Glu Asp Glu Lys Asn Ala Ala Met Glu Lys Ala Ala His Gln Tyr

530 535 540

Asn Gly Lys Lys Glu Leu Pro Asp Asn Val Phe His Gly His Lys Glu

545 550 555 560

Leu Ala Thr Lys Ile Arg Leu Trp His Gln Gln Gly Glu Lys Cys Leu

565 570 575

Tyr Thr Gly Lys Asn Ile Pro Ile Ser Asp Leu Ile His Asn Gln Tyr

580 585 590

Lys Tyr Glu Ile Asp His Ile Leu Pro Leu Ser Leu Ser Phe Asp Asp

595 600 605

Ser Leu Ala Asn Lys Val Leu Val Leu Ala Thr Ala Asn Gln Glu Lys

610 615 620

Gly Gln Arg Thr Pro Phe Gln Ala Leu Asp Ser Met Asp Asp Ala Trp

625 630 635 640

Ser Tyr Arg Glu Phe Lys Ala Tyr Val Arg Gly Ala Arg Ala Leu Ser

645 650 655

Asn Lys Lys Lys Asp Tyr Leu Leu Asn Glu Glu Asp Ile Asn Lys Ile

660 665 670

Glu Val Lys Gln Lys Phe Ile Glu Arg Asn Leu Val Asp Thr Arg Tyr

675 680 685

Ser Ser Arg Val Val Leu Asn Ala Leu Gln Asp Phe Tyr Lys Leu Asn

690 695 700

Asp Phe Asp Thr Lys Ile Ser Val Val Arg Gly Gln Phe Thr Ser Gln

705 710 715 720

Leu Arg Arg Lys Trp Arg Ile Asp Lys Ser Arg Glu Thr Tyr His His

725 730 735

His Ala Val Asp Ala Leu Ile Ile Ala Ala Ser Ser Gln Leu Arg Leu

740 745 750

Trp Lys Lys Gln Gly Asn Pro Leu Ile Ser Tyr Lys Glu Asn Gln Phe

755 760 765

Val Asp Ser Glu Thr Gly Glu Ile Ile Ser Leu Thr Asp Asp Glu Tyr

770 775 780

Lys Glu Leu Val Phe Arg Ala Pro Tyr Asp His Phe Val Asp Thr Val

785 790 795 800

Ser Ser Lys Lys Phe Glu Asp Arg Ile Leu Phe Ser Tyr Gln Val Asp

805 810 815

Ser Lys Tyr Asn Arg Lys Ile Ser Asp Ala Thr Ile Tyr Ser Thr Arg

820 825 830

Lys Ala Lys Leu Gly Lys Asp Lys Ser Glu Glu Thr Tyr Val Leu Gly

835 840 845

Lys Ile Lys Asp Ile Tyr Thr Gln Thr Gly Tyr Asp Ala Phe Ile Lys

850 855 860

Leu Tyr Lys Lys Asp Lys Ser Lys Phe Leu Met Tyr His Lys Asp Pro

865 870 875 880

Ile Thr Phe Glu Lys Val Ile Glu Glu Ile Leu Lys Thr Tyr Pro Asp

885 890 895

Lys Glu Ile Asn Glu Lys Gly Lys Glu Val Ala Cys Asn Pro Phe Glu

900 905 910

Lys Tyr Arg Gln Glu Asn Gly Pro Leu Arg Lys Tyr Ser Lys Lys Gly

915 920 925

Lys Gly Pro Glu Ile Lys Ser Leu Lys Tyr Tyr Asp Asn Lys Leu Gly

930 935 940

Asn His Ile Asp Ile Thr Pro Asp Asn Ser Glu Asn Gln Val Ile Leu

945 950 955 960

Gln Ser Leu Lys Pro Trp Arg Thr Asp Val Tyr Phe Asn His Lys Thr

965 970 975

Lys Ile Tyr Glu Leu Met Gly Leu Lys Tyr Ser Asp Leu Ser Phe Glu

980 985 990

Lys Gly Ser Gly Lys Tyr Arg Ile Ser Leu Asp Lys Tyr Asn Val Ile

995 1000 1005

Lys Lys Lys Glu Gly Val His Lys Glu Ser Glu Phe Lys Phe Thr

1010 1015 1020

Leu Tyr Lys Asn Asp Leu Ile Leu Ile Lys Asp Leu Glu Lys Ser

1025 1030 1035

Glu Gln Gln Leu Phe Arg Tyr Asn Ser Arg Asn Asp Thr Ser Lys

1040 1045 1050

His Tyr Val Glu Leu Lys Pro Tyr Asp Lys Ala Lys Phe Glu Gly

1055 1060 1065

Asn Gln Pro Leu Met Ala Leu Phe Gly Asn Val Ala Lys Gly Gly

1070 1075 1080

Gln Cys Leu Lys Gly Leu Asn Lys Ala Asn Ile Ser Ile Tyr Lys

1085 1090 1095

Val Gln Thr Asp Val Leu Gly Asn Lys Arg Phe Ile Lys Lys Glu

1100 1105 1110

Gly Asp Ala Pro Lys Leu Glu Phe

1115 1120

<210> 2

<211> 3363

<212> DNA

<213> Artificial Sequence

<400> 2

aacggcaaga tcctgggact ggccatcgga gttgcatctg ttggagtggg catcctggac 60

aagaagaccg gcgagatcat ccacgccagc agcagaatct tccccgccgc cacagccgat 120

agcaacgtgg aacggagggg cttcagacag ggaagacggc tgggccgtag aaaaaaacac 180

agaaaggtgc ggttggccga tctgttcagc gacaccggcc tgataacaga cttctctaaa 240

gtgtctatca acctgaaccc ctacgagctg cggatcaagg gcctcaatga gaaactgaca 300

aacgaggaac tgttcatcgc cctgaagaac atcgtgaaga gaagaggcat cagctacctg 360

gatgacgcca atgaggacgg cgagagctcc tctagcgagt acggcaaggc tgtggaagaa 420

aaccgaaagt tgctggccga caagactcct ggccagatcc agctggaacg cttcgaaaag 480

tacggacagg tccgaggaga tttcaccatc gaggaaaacg gcgaaaagca tagactgctg 540

aacgtgttca gcaccagcgc ctataagaaa gaagccgagc ggattctgac caagcagcaa 600

gattacaacc aagacatcac cgacgagttc atccaggcct acctgacaat cctgacggga 660

aagagaaagt actaccatgg ccccggcaac gagaagtcta gaaccgacta cggccggttc 720

aggaccgatg gcaccaccct ggacaacatc tttggcatcc tgatcggcaa atgtacattc 780

tacccagagg agtaccgggc ggccaaggcc tcttacaccg cccaggagtt taacctcctg 840

aatgacctga acaatctgac agttccaacc gagacaaaga aactgagcga ggaacagaag 900

cggcaaatca tcgagtacgc caagggagcc aagacacttg gagccgccac cctgctcaag 960

tacatcgcca agctggtgga cggctctgtg gaggatatca agggctatag aattgataaa 1020

agcgagaaac ctgagatgca cacattcgat atctacagaa agatgcagac actggaaacc 1080

gtggatgtgg aaaagctgtc acgcgaggtg ctggatgagc tggcccatat cctgacactg 1140

aataccgaga gagaaggtat cgaggaggcc atcaaggtca gctttatcaa gagagagttc 1200

gaacaggacc agatcgccga gctggtcagc ttccggaagt ccaactctag cctgtttggc 1260

aagggctggc acaacttcag tatcaaactg atgacagaac tgatccccga gctgtatgag 1320

accagcgaag agcagatgac catcctgacc agactgggaa agcaaaagac aaaggctaga 1380

agcaagcgca caaagtacat cgacgagaag gagctgaccg acgagatcta caaccccgtg 1440

gtggccaaga gcgtgagaca ggccattaag atcatcaacc tggccaccaa gaagtacggc 1500

gtgttcgaca acatcgtgat cgagatggcc agagagaaca acgaggagga tgccaagaaa 1560

gattacgtga aaagacaaaa agctaatgag gacgaaaaga acgccgctat ggaaaaggct 1620

gcccaccagt acaacggcaa gaaggagctg cccgataacg tgtttcacgg ccacaaggaa 1680

ctggccacaa agatcagact gtggcaccag cagggcgaga agtgcctgta caccggcaaa 1740

aacatcccta tctctgatct gatccacaac cagtataagt acgagatcga ccacatcctg 1800

cctctgtcac tgagcttcga cgacagcctg gccaataagg tgctggtgct cgctaccgcc 1860

aaccaggaga agggccaaag aacacctttc caggccctcg acagcatgga cgatgcgtgg 1920

tcctatagag aatttaaggc ctacgtgcgg ggcgccagag ccctgagcaa caagaaaaaa 1980

gattacctgc tgaatgaaga ggacatcaac aagatcgaag tgaagcagaa attcatcgag 2040

aggaaccttg tggacactcg gtactcctct agagtggtcc tgaacgccct gcaggacttc 2100

tacaagctga atgatttcga caccaagatc agcgtggtga gaggccagtt caccagccag 2160

ctgagacgga aatggagaat cgacaagagc agagaaacct accaccacca cgccgtggac 2220

gctctgatca ttgccgctag ctcgcagctg agactgtgga agaagcaggg caacccactg 2280

atcagctaca aggaaaacca gttcgtcgac tccgaaaccg gagaaattat cagcctcaca 2340

gatgatgaat acaaggaact ggtgttccgg gctccatacg accacttcgt ggacacagtg 2400

agcagcaaaa agtttgaaga cagaatcctt ttctcctacc aggtggattc caaatacaac 2460

cggaaaatca gcgacgccac catttactct accagaaagg ccaagctggg caaagacaag 2520

agcgaggaaa cctacgtgct gggcaagata aaggacatct acacccagac cggctacgat 2580

gccttcatca agctgtacaa gaaggacaag tccaaatttc tgatgtacca caaggatcct 2640

atcacctttg agaaggtgat cgaggaaatc ctgaagacct accccgacaa ggaaatcaac 2700

gagaagggca aggaagtggc atgcaaccct tttgaaaaat atagacagga gaatggacct 2760

ctgagaaagt attctaagaa aggtaagggc cctgagatca agagcctgaa gtactacgac 2820

aacaaactcg gcaaccacat cgacataacc cctgacaaca gcgaaaatca ggtgatcctc 2880

cagtccctga aaccttggcg gaccgacgtg tacttcaacc acaaaaccaa gatttatgag 2940

ctgatgggcc tgaagtacag cgacctgagc ttcgagaagg gcagcggcaa gtaccggatt 3000

agcctggaca aatataacgt gatcaagaaa aaggagggcg tgcacaagga aagcgagttc 3060

aagttcacac tgtacaagaa cgacctgatc ctaatcaagg atctggaaaa gagcgagcag 3120

cagctgttta gatacaacag ccggaacgat acatccaagc actacgtgga gctgaagcct 3180

tacgacaagg ccaaattcga gggaaatcaa cctctgatgg ccctgttcgg caatgtggcc 3240

aagggaggcc agtgcctgaa gggcctgaac aaagccaaca tcagcatcta caaggtgcag 3300

accgacgtgc tgggcaacaa gcggttcatc aagaaagaag gcgacgctcc taagctggaa 3360

ttt 3363

<210> 3

<211> 228

<212> PRT

<213> Artificial Sequence

<400> 3

Ser Ser Glu Thr Gly Pro Val Ala Val Asp Pro Thr Leu Arg Arg Arg

1 5 10 15

Ile Glu Pro His Glu Phe Glu Val Phe Phe Asp Pro Arg Glu Leu Arg

20 25 30

Lys Glu Thr Cys Leu Leu Tyr Glu Ile Lys Trp Gly Thr Ser His Lys

35 40 45

Ile Trp Arg His Ser Ser Lys Asn Thr Thr Lys His Val Glu Val Asn

50 55 60

Phe Ile Glu Lys Phe Thr Ser Glu Arg His Phe Cys Pro Ser Thr Ser

65 70 75 80

Cys Ser Ile Thr Trp Phe Leu Ser Trp Ser Pro Cys Gly Glu Cys Ser

85 90 95

Lys Ala Ile Thr Glu Phe Leu Ser Gln His Pro Asn Val Thr Leu Val

100 105 110

Ile Tyr Val Ala Arg Leu Tyr His His Met Asp Gln Gln Asn Arg Gln

115 120 125

Gly Leu Arg Asp Leu Val Asn Ser Gly Val Thr Ile Gln Ile Met Thr

130 135 140

Ala Pro Glu Tyr Asp Tyr Cys Trp Arg Asn Phe Val Asn Tyr Pro Pro

145 150 155 160

Gly Lys Glu Ala His Trp Pro Arg Tyr Pro Pro Leu Trp Met Lys Leu

165 170 175

Tyr Ala Leu Glu Leu His Ala Gly Ile Leu Gly Leu Pro Pro Cys Leu

180 185 190

Asn Ile Leu Arg Arg Lys Gln Pro Gln Leu Thr Phe Phe Thr Ile Ala

195 200 205

Leu Gln Ser Cys His Tyr Gln Arg Leu Pro Pro His Ile Leu Trp Ala

210 215 220

Thr Gly Leu Lys

225

<210> 4

<211> 190

<212> PRT

<213> Artificial Sequence

<400> 4

Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val

1 5 10 15

Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile

20 25 30

Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu

35 40 45

Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr

50 55 60

Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile

65 70 75 80

Lys Met Leu Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Thr Asn Leu

85 90 95

Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val Ile Gln Glu

100 105 110

Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile Gly Asn Lys

115 120 125

Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp

130 135 140

Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp

145 150 155 160

Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu

165 170 175

Ser Gly Gly Ser Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu

180 185 190

<210> 5

<211> 1595

<212> PRT

<213> Artificial Sequence

<400> 5

Pro Lys Lys Lys Arg Lys Val Ser Ser Glu Thr Gly Pro Val Ala Val

1 5 10 15

Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu Val Phe

20 25 30

Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr Glu Ile

35 40 45

Lys Trp Gly Thr Ser His Lys Ile Trp Arg His Ser Ser Lys Asn Thr

50 55 60

Thr Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Ser Glu Arg

65 70 75 80

His Phe Cys Pro Ser Thr Ser Cys Ser Ile Thr Trp Phe Leu Ser Trp

85 90 95

Ser Pro Cys Gly Glu Cys Ser Lys Ala Ile Thr Glu Phe Leu Ser Gln

100 105 110

His Pro Asn Val Thr Leu Val Ile Tyr Val Ala Arg Leu Tyr His His

115 120 125

Met Asp Gln Gln Asn Arg Gln Gly Leu Arg Asp Leu Val Asn Ser Gly

130 135 140

Val Thr Ile Gln Ile Met Thr Ala Pro Glu Tyr Asp Tyr Cys Trp Arg

145 150 155 160

Asn Phe Val Asn Tyr Pro Pro Gly Lys Glu Ala His Trp Pro Arg Tyr

165 170 175

Pro Pro Leu Trp Met Lys Leu Tyr Ala Leu Glu Leu His Ala Gly Ile

180 185 190

Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln Pro Gln

195 200 205

Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln Arg Leu

210 215 220

Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Ser Gly Gly Ser Ser

225 230 235 240

Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr

245 250 255

Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser Asn Gly Lys Ile Leu

260 265 270

Gly Leu Ala Ile Gly Val Ala Ser Val Gly Val Gly Ile Leu Asp Lys

275 280 285

Lys Thr Gly Glu Ile Ile His Ala Ser Ser Arg Ile Phe Pro Ala Ala

290 295 300

Thr Ala Asp Ser Asn Val Glu Arg Arg Gly Phe Arg Gln Gly Arg Arg

305 310 315 320

Leu Gly Arg Arg Lys Lys His Arg Lys Val Arg Leu Ala Asp Leu Phe

325 330 335

Ser Asp Thr Gly Leu Ile Thr Asp Phe Ser Lys Val Ser Ile Asn Leu

340 345 350

Asn Pro Tyr Glu Leu Arg Ile Lys Gly Leu Asn Glu Lys Leu Thr Asn

355 360 365

Glu Glu Leu Phe Ile Ala Leu Lys Asn Ile Val Lys Arg Arg Gly Ile

370 375 380

Ser Tyr Leu Asp Asp Ala Asn Glu Asp Gly Glu Ser Ser Ser Ser Glu

385 390 395 400

Tyr Gly Lys Ala Val Glu Glu Asn Arg Lys Leu Leu Ala Asp Lys Thr

405 410 415

Pro Gly Gln Ile Gln Leu Glu Arg Phe Glu Lys Tyr Gly Gln Val Arg

420 425 430

Gly Asp Phe Thr Ile Glu Glu Asn Gly Glu Lys His Arg Leu Leu Asn

435 440 445

Val Phe Ser Thr Ser Ala Tyr Lys Lys Glu Ala Glu Arg Ile Leu Thr

450 455 460

Lys Gln Gln Asp Tyr Asn Gln Asp Ile Thr Asp Glu Phe Ile Gln Ala

465 470 475 480

Tyr Leu Thr Ile Leu Thr Gly Lys Arg Lys Tyr Tyr His Gly Pro Gly

485 490 495

Asn Glu Lys Ser Arg Thr Asp Tyr Gly Arg Phe Arg Thr Asp Gly Thr

500 505 510

Thr Leu Asp Asn Ile Phe Gly Ile Leu Ile Gly Lys Cys Thr Phe Tyr

515 520 525

Pro Glu Glu Tyr Arg Ala Ala Lys Ala Ser Tyr Thr Ala Gln Glu Phe

530 535 540

Asn Leu Leu Asn Asp Leu Asn Asn Leu Thr Val Pro Thr Glu Thr Lys

545 550 555 560

Lys Leu Ser Glu Glu Gln Lys Arg Gln Ile Ile Glu Tyr Ala Lys Gly

565 570 575

Ala Lys Thr Leu Gly Ala Ala Thr Leu Leu Lys Tyr Ile Ala Lys Leu

580 585 590

Val Asp Gly Ser Val Glu Asp Ile Lys Gly Tyr Arg Ile Asp Lys Ser

595 600 605

Glu Lys Pro Glu Met His Thr Phe Asp Ile Tyr Arg Lys Met Gln Thr

610 615 620

Leu Glu Thr Val Asp Val Glu Lys Leu Ser Arg Glu Val Leu Asp Glu

625 630 635 640

Leu Ala His Ile Leu Thr Leu Asn Thr Glu Arg Glu Gly Ile Glu Glu

645 650 655

Ala Ile Lys Val Ser Phe Ile Lys Arg Glu Phe Glu Gln Asp Gln Ile

660 665 670

Ala Glu Leu Val Ser Phe Arg Lys Ser Asn Ser Ser Leu Phe Gly Lys

675 680 685

Gly Trp His Asn Phe Ser Ile Lys Leu Met Thr Glu Leu Ile Pro Glu

690 695 700

Leu Tyr Glu Thr Ser Glu Glu Gln Met Thr Ile Leu Thr Arg Leu Gly

705 710 715 720

Lys Gln Lys Thr Lys Ala Arg Ser Lys Arg Thr Lys Tyr Ile Asp Glu

725 730 735

Lys Glu Leu Thr Asp Glu Ile Tyr Asn Pro Val Val Ala Lys Ser Val

740 745 750

Arg Gln Ala Ile Lys Ile Ile Asn Leu Ala Thr Lys Lys Tyr Gly Val

755 760 765

Phe Asp Asn Ile Val Ile Glu Met Ala Arg Glu Asn Asn Glu Glu Asp

770 775 780

Ala Lys Lys Asp Tyr Val Lys Arg Gln Lys Ala Asn Glu Asp Glu Lys

785 790 795 800

Asn Ala Ala Met Glu Lys Ala Ala His Gln Tyr Asn Gly Lys Lys Glu

805 810 815

Leu Pro Asp Asn Val Phe His Gly His Lys Glu Leu Ala Thr Lys Ile

820 825 830

Arg Leu Trp His Gln Gln Gly Glu Lys Cys Leu Tyr Thr Gly Lys Asn

835 840 845

Ile Pro Ile Ser Asp Leu Ile His Asn Gln Tyr Lys Tyr Glu Ile Asp

850 855 860

His Ile Leu Pro Leu Ser Leu Ser Phe Asp Asp Ser Leu Ala Asn Lys

865 870 875 880

Val Leu Val Leu Ala Thr Ala Asn Gln Glu Lys Gly Gln Arg Thr Pro

885 890 895

Phe Gln Ala Leu Asp Ser Met Asp Asp Ala Trp Ser Tyr Arg Glu Phe

900 905 910

Lys Ala Tyr Val Arg Gly Ala Arg Ala Leu Ser Asn Lys Lys Lys Asp

915 920 925

Tyr Leu Leu Asn Glu Glu Asp Ile Asn Lys Ile Glu Val Lys Gln Lys

930 935 940

Phe Ile Glu Arg Asn Leu Val Asp Thr Arg Tyr Ser Ser Arg Val Val

945 950 955 960

Leu Asn Ala Leu Gln Asp Phe Tyr Lys Leu Asn Asp Phe Asp Thr Lys

965 970 975

Ile Ser Val Val Arg Gly Gln Phe Thr Ser Gln Leu Arg Arg Lys Trp

980 985 990

Arg Ile Asp Lys Ser Arg Glu Thr Tyr His His His Ala Val Asp Ala

995 1000 1005

Leu Ile Ile Ala Ala Ser Ser Gln Leu Arg Leu Trp Lys Lys Gln

1010 1015 1020

Gly Asn Pro Leu Ile Ser Tyr Lys Glu Asn Gln Phe Val Asp Ser

1025 1030 1035

Glu Thr Gly Glu Ile Ile Ser Leu Thr Asp Asp Glu Tyr Lys Glu

1040 1045 1050

Leu Val Phe Arg Ala Pro Tyr Asp His Phe Val Asp Thr Val Ser

1055 1060 1065

Ser Lys Lys Phe Glu Asp Arg Ile Leu Phe Ser Tyr Gln Val Asp

1070 1075 1080

Ser Lys Tyr Asn Arg Lys Ile Ser Asp Ala Thr Ile Tyr Ser Thr

1085 1090 1095

Arg Lys Ala Lys Leu Gly Lys Asp Lys Ser Glu Glu Thr Tyr Val

1100 1105 1110

Leu Gly Lys Ile Lys Asp Ile Tyr Thr Gln Thr Gly Tyr Asp Ala

1115 1120 1125

Phe Ile Lys Leu Tyr Lys Lys Asp Lys Ser Lys Phe Leu Met Tyr

1130 1135 1140

His Lys Asp Pro Ile Thr Phe Glu Lys Val Ile Glu Glu Ile Leu

1145 1150 1155

Lys Thr Tyr Pro Asp Lys Glu Ile Asn Glu Lys Gly Lys Glu Val

1160 1165 1170

Ala Cys Asn Pro Phe Glu Lys Tyr Arg Gln Glu Asn Gly Pro Leu

1175 1180 1185

Arg Lys Tyr Ser Lys Lys Gly Lys Gly Pro Glu Ile Lys Ser Leu

1190 1195 1200

Lys Tyr Tyr Asp Asn Lys Leu Gly Asn His Ile Asp Ile Thr Pro

1205 1210 1215

Asp Asn Ser Glu Asn Gln Val Ile Leu Gln Ser Leu Lys Pro Trp

1220 1225 1230

Arg Thr Asp Val Tyr Phe Asn His Lys Thr Lys Ile Tyr Glu Leu

1235 1240 1245

Met Gly Leu Lys Tyr Ser Asp Leu Ser Phe Glu Lys Gly Ser Gly

1250 1255 1260

Lys Tyr Arg Ile Ser Leu Asp Lys Tyr Asn Val Ile Lys Lys Lys

1265 1270 1275

Glu Gly Val His Lys Glu Ser Glu Phe Lys Phe Thr Leu Tyr Lys

1280 1285 1290

Asn Asp Leu Ile Leu Ile Lys Asp Leu Glu Lys Ser Glu Gln Gln

1295 1300 1305

Leu Phe Arg Tyr Asn Ser Arg Asn Asp Thr Ser Lys His Tyr Val

1310 1315 1320

Glu Leu Lys Pro Tyr Asp Lys Ala Lys Phe Glu Gly Asn Gln Pro

1325 1330 1335

Leu Met Ala Leu Phe Gly Asn Val Ala Lys Gly Gly Gln Cys Leu

1340 1345 1350

Lys Gly Leu Asn Lys Ala Asn Ile Ser Ile Tyr Lys Val Gln Thr

1355 1360 1365

Asp Val Leu Gly Asn Lys Arg Phe Ile Lys Lys Glu Gly Asp Ala

1370 1375 1380

Pro Lys Leu Glu Phe Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser

1385 1390 1395

Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu

1400 1405 1410

Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu

1415 1420 1425

Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala

1430 1435 1440

Tyr Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp

1445 1450 1455

Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn

1460 1465 1470

Gly Glu Asn Lys Ile Lys Met Leu Ser Gly Gly Ser Gly Gly Ser

1475 1480 1485

Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly

1490 1495 1500

Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu

1505 1510 1515

Val Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val

1520 1525 1530

His Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu

1535 1540 1545

Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln

1550 1555 1560

Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu Ser Gly Gly Ser

1565 1570 1575

Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys Lys Lys Arg

1580 1585 1590

Lys Val

1595

<210> 6

<211> 4785

<212> DNA

<213> Artificial Sequence

<400> 6

ccaaagaaga agcggaaagt cagcagtgaa accggaccag tggcagtgga cccaaccctg 60

aggagacgga ttgagcccca tgaatttgaa gtgttctttg acccaaggga gctgaggaag 120

gagacatgcc tgctgtacga gatcaagtgg ggcacaagcc acaagatctg gcgccacagc 180

tccaagaaca ccacaaagca cgtggaagtg aatttcatcg agaagtttac ctccgagcgg 240

cacttctgcc cctctaccag ctgttccatc acatggtttc tgtcttggag cccttgcggc 300

gagtgttcca aggccatcac cgagttcctg tctcagcacc ctaacgtgac cctggtcatc 360

tacgtggccc ggctgtatca ccacatggac cagcagaaca ggcagggcct gcgcgatctg 420

gtgaattctg gcgtgaccat ccagatcatg acagccccag agtacgacta ttgctggcgg 480

aacttcgtga attatccacc tggcaaggag gcacactggc caagataccc acccctgtgg 540

atgaagctgt atgcactgga gctgcacgca ggaatcctgg gcctgcctcc atgtctgaat 600

atcctgcgga gaaagcagcc ccagctgaca tttttcacca ttgctctgca gtcttgtcac 660

tatcagcggc tgcctcctca tattctgtgg gctacaggcc tgaagtctgg aggatctagc 720

ggaggatcct ctggcagcga gacaccagga acaagcgagt cagcaacacc agagagcagt 780

ggcggcagca gcggcggcag caacggcaag atcctgggac tggccatcgg agttgcatct 840

gttggagtgg gcatcctgga caagaagacc ggcgagatca tccacgccag cagcagaatc 900

ttccccgccg ccacagccga tagcaacgtg gaacggaggg gcttcagaca gggaagacgg 960

ctgggccgta gaaaaaaaca cagaaaggtg cggttggccg atctgttcag cgacaccggc 1020

ctgataacag acttctctaa agtgtctatc aacctgaacc cctacgagct gcggatcaag 1080

ggcctcaatg agaaactgac aaacgaggaa ctgttcatcg ccctgaagaa catcgtgaag 1140

agaagaggca tcagctacct ggatgacgcc aatgaggacg gcgagagctc ctctagcgag 1200

tacggcaagg ctgtggaaga aaaccgaaag ttgctggccg acaagactcc tggccagatc 1260

cagctggaac gcttcgaaaa gtacggacag gtccgaggag atttcaccat cgaggaaaac 1320

ggcgaaaagc atagactgct gaacgtgttc agcaccagcg cctataagaa agaagccgag 1380

cggattctga ccaagcagca agattacaac caagacatca ccgacgagtt catccaggcc 1440

tacctgacaa tcctgacggg aaagagaaag tactaccatg gccccggcaa cgagaagtct 1500

agaaccgact acggccggtt caggaccgat ggcaccaccc tggacaacat ctttggcatc 1560

ctgatcggca aatgtacatt ctacccagag gagtaccggg cggccaaggc ctcttacacc 1620

gcccaggagt ttaacctcct gaatgacctg aacaatctga cagttccaac cgagacaaag 1680

aaactgagcg aggaacagaa gcggcaaatc atcgagtacg ccaagggagc caagacactt 1740

ggagccgcca ccctgctcaa gtacatcgcc aagctggtgg acggctctgt ggaggatatc 1800

aagggctata gaattgataa aagcgagaaa cctgagatgc acacattcga tatctacaga 1860

aagatgcaga cactggaaac cgtggatgtg gaaaagctgt cacgcgaggt gctggatgag 1920

ctggcccata tcctgacact gaataccgag agagaaggta tcgaggaggc catcaaggtc 1980

agctttatca agagagagtt cgaacaggac cagatcgccg agctggtcag cttccggaag 2040

tccaactcta gcctgtttgg caagggctgg cacaacttca gtatcaaact gatgacagaa 2100

ctgatccccg agctgtatga gaccagcgaa gagcagatga ccatcctgac cagactggga 2160

aagcaaaaga caaaggctag aagcaagcgc acaaagtaca tcgacgagaa ggagctgacc 2220

gacgagatct acaaccccgt ggtggccaag agcgtgagac aggccattaa gatcatcaac 2280

ctggccacca agaagtacgg cgtgttcgac aacatcgtga tcgagatggc cagagagaac 2340

aacgaggagg atgccaagaa agattacgtg aaaagacaaa aagctaatga ggacgaaaag 2400

aacgccgcta tggaaaaggc tgcccaccag tacaacggca agaaggagct gcccgataac 2460

gtgtttcacg gccacaagga actggccaca aagatcagac tgtggcacca gcagggcgag 2520

aagtgcctgt acaccggcaa aaacatccct atctctgatc tgatccacaa ccagtataag 2580

tacgagatcg accacatcct gcctctgtca ctgagcttcg acgacagcct ggccaataag 2640

gtgctggtgc tcgctaccgc caaccaggag aagggccaaa gaacaccttt ccaggccctc 2700

gacagcatgg acgatgcgtg gtcctataga gaatttaagg cctacgtgcg gggcgccaga 2760

gccctgagca acaagaaaaa agattacctg ctgaatgaag aggacatcaa caagatcgaa 2820

gtgaagcaga aattcatcga gaggaacctt gtggacactc ggtactcctc tagagtggtc 2880

ctgaacgccc tgcaggactt ctacaagctg aatgatttcg acaccaagat cagcgtggtg 2940

agaggccagt tcaccagcca gctgagacgg aaatggagaa tcgacaagag cagagaaacc 3000

taccaccacc acgccgtgga cgctctgatc attgccgcta gctcgcagct gagactgtgg 3060

aagaagcagg gcaacccact gatcagctac aaggaaaacc agttcgtcga ctccgaaacc 3120

ggagaaatta tcagcctcac agatgatgaa tacaaggaac tggtgttccg ggctccatac 3180

gaccacttcg tggacacagt gagcagcaaa aagtttgaag acagaatcct tttctcctac 3240

caggtggatt ccaaatacaa ccggaaaatc agcgacgcca ccatttactc taccagaaag 3300

gccaagctgg gcaaagacaa gagcgaggaa acctacgtgc tgggcaagat aaaggacatc 3360

tacacccaga ccggctacga tgccttcatc aagctgtaca agaaggacaa gtccaaattt 3420

ctgatgtacc acaaggatcc tatcaccttt gagaaggtga tcgaggaaat cctgaagacc 3480

taccccgaca aggaaatcaa cgagaagggc aaggaagtgg catgcaaccc ttttgaaaaa 3540

tatagacagg agaatggacc tctgagaaag tattctaaga aaggtaaggg ccctgagatc 3600

aagagcctga agtactacga caacaaactc ggcaaccaca tcgacataac ccctgacaac 3660

agcgaaaatc aggtgatcct ccagtccctg aaaccttggc ggaccgacgt gtacttcaac 3720

cacaaaacca agatttatga gctgatgggc ctgaagtaca gcgacctgag cttcgagaag 3780

ggcagcggca agtaccggat tagcctggac aaatataacg tgatcaagaa aaaggagggc 3840

gtgcacaagg aaagcgagtt caagttcaca ctgtacaaga acgacctgat cctaatcaag 3900

gatctggaaa agagcgagca gcagctgttt agatacaaca gccggaacga tacatccaag 3960

cactacgtgg agctgaagcc ttacgacaag gccaaattcg agggaaatca acctctgatg 4020

gccctgttcg gcaatgtggc caagggaggc cagtgcctga agggcctgaa caaagccaac 4080

atcagcatct acaaggtgca gaccgacgtg ctgggcaaca agcggttcat caagaaagaa 4140

ggcgacgctc ctaagctgga atttagcggc gggagcggcg ggagcggggg gagcactaat 4200

ctgagcgaca tcattgagaa ggagactggg aaacagctgg tcattcagga gtccatcctg 4260

atgctgcctg aggaggtgga ggaagtgatc ggcaacaagc cagagtctga catcctggtg 4320

cacaccgcct acgacgagtc cacagatgag aatgtgatgc tgctgacctc tgacgccccc 4380

gagtataagc cttgggccct ggtcatccag gattctaacg gcgagaataa gatcaagatg 4440

ctgagcggag gatccggagg atctggaggc agcaccaacc tgtctgacat catcgagaag 4500

gagacaggca agcagctggt catccaggag agcatcctga tgctgcccga agaagtcgaa 4560

gaagtgatcg gaaacaagcc tgagagcgat atcctggtcc ataccgccta cgacgagagt 4620

accgacgaaa atgtgatgct gctgacatcc gacgccccag agtataagcc ctgggctctg 4680

gtcatccagg attccaacgg agagaacaaa atcaaaatgc tgtctggcgg ctcaaaaaga 4740

accgccgacg gcagcgaatt cgagcccaag aagaagagga aagtc 4785
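The correspondence between the nucleotide listing of SEQ ID NO: 6 and the amino acid listing that ends immediately before it can be spot-checked by translating a few codons. The following is a minimal sketch and not part of the original filing: the `translate` helper and the use of the standard genetic code are illustrative assumptions, and only the first 21 and the last 45 nucleotides of SEQ ID NO: 6 are copied in.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1 (hypothetical helper; spaces are ignored)."""
    dna = dna.replace(" ", "").lower()
    if len(dna) % 3:
        raise ValueError("length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 21 nt and last 45 nt of SEQ ID NO: 6, copied from the listing.
head_nt = "ccaaagaaga agcggaaagt c"
tail_nt = "accgccgacg gcagcgaatt cgagcccaag aagaagagga aagtc"

print(translate(head_nt))  # PKKKRKV
print(translate(tail_nt))  # TADGSEFEPKKKRKV
print(4785 // 3)           # 1595 codons for the declared length of 4785 nt
```

Under these assumptions, both ends of SEQ ID NO: 6 decode to the seven-residue sequence given as SEQ ID NO: 9 (Pro Lys Lys Lys Arg Lys Val), the last 15 codons match the final 15 residues of the preceding amino acid listing, and 4785 nucleotides correspond to 1595 codons, which agrees with the final residue number 1595.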

<210> 7

<211> 4937

<212> DNA

<213> Artificial Sequence

<400> 7

gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60

ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120

aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180

atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240

cgaaacaccg tgagaccgag agagggtctc agtttttgta ctctcaagaa attgcagaag 300

ctacaaagat aaggcttcat gccgaaatca acaccctgtc tcttggcggg gtgttttttt 360

ttttaaagaa ttctcgacct cgagacaaat ggcagtattc atccacaatt ttaaaagaaa 420

aggggggatt ggggggtaca gtgcagggga aagaatagta gacataatag caacagacat 480

acaaactaaa gaattacaaa aacaaattac aaaaattcaa aattttcggg tttattacag 540

ggacagcaga gatccacttt ggccgcggct cgagggggtt ggggttgcgc cttttccaag 600

gcagccctgg gtttgcgcag ggacgcggct gctctgggcg tggttccggg aaacgcagcg 660

gcgccgaccc tgggactcgc acattcttca cgtccgttcg cagcgtcacc cggatcttcg 720

ccgctaccct tgtgggcccc ccggcgacgc ttcctgctcc gcccctaagt cgggaaggtt 780

ccttgcggtt cgcggcgtgc cggacgtgac aaacggaagc cgcacgtctc actagtaccc 840

tcgcagacgg acagcgccag ggagcaatgg cagcgcgccg accgcgatgg gctgtggcca 900

atagcggctg ctcagcaggg cgcgccgaga gcagcggccg ggaaggggcg gtgcgggagg 960

cggggtgtgg ggcggtagtg tgggccctgt tcctgcccgc gcggtgttcc gcattctgca 1020

agcctccgga gcgcacgtcg gcagtcggct ccctcgttga ccgaatcacc gacctctctc 1080

cccaggggga tccatggtga gcaagggcga ggagctgttc accggggtgg tgcccatcct 1140

ggtcgagctg gacggcgacg taaacggcca caagttcagc gtgtccggcg agggcgaggg 1200

cgatgccacc tacggcaagc tgaccctgaa gttcatctgc accaccggca agctgcccgt 1260

gccctggccc accctcgtga ccaccctgac ctacggcgtg cagtgcttca gccgctaccc 1320

cgaccacatg aagcagcacg acttcttcaa gtccgccatg cccgaaggct acgtccagga 1380

gcgcaccatc ttcttcaagg acgacggcaa ctacaagacc cgcgccgagg tgaagttcga 1440

gggcgacacc ctggtgaacc gcatcgagct gaagggcatc gacttcaagg aggacggcaa 1500

catcctgggg cacaagctgg agtacaacta caacagccac aacgtctata tcatggccga 1560

caagcagaag aacggcatca aggtgaactt caagatccgc cacaacatcg aggacggcag 1620

cgtgcagctc gccgaccact accagcagaa cacccccatc ggcgacggcc ccgtgctgct 1680

gcccgacaac cactacctga gcacccagtc cgccctgagc aaagacccca acgagaagcg 1740

cgatcacatg gtcctgctgg agttcgtgac cgccgccggg atcactctcg gcatggacga 1800

gctgtacaag taaagcggcc gcgactctag atcataatca gccataccac atttgtagag 1860

gttttacttg ctttaaaaaa cctcccacac ctccccctga acctgaaaca taaaatgaat 1920

gcaattgttg ttgttaactt gtttattgca gcttataatg gttacaaata aagcaatagc 1980

atcacaaatt tcacaaataa agcatttttt tcactgcatt ctagttgtgg tttgtccaaa 2040

ctcatcaatg tatcttagtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt 2100

ccggtgggcg cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca 2160

actcgtagga caggtgccgg cagcgctctt ccgcttcctc gctcactgac tcgctgcgct 2220

cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca 2280

cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga 2340

accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 2400

acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 2460

cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 2520

acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 2580

atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 2640

agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 2700

acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 2760

gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg 2820

gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg 2880

gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 2940

gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga 3000

acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga 3060

tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt 3120

ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt 3180

catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat 3240

ctggccccag tgctgcaatg ataccgcggg acccacgctc accggctcca gatttatcag 3300

caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct 3360

ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt 3420

tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg 3480

cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca 3540

aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt 3600

tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat 3660

gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac 3720

cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa 3780

aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt 3840

tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt 3900

tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa 3960

gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt 4020

atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa 4080

taggggttcc gcgcacattt ccccgaaaag tgccacctga cgcgccctgt agcggcgcat 4140

taagcgcggc gggtgtggtg gttacgcgca gcgtgaccgc tacacttgcc agcgccctag 4200

cgcccgctcc tttcgctttc ttcccttcct ttctcgccac gttcgccggc tttccccgtc 4260

aagctctaaa tcgggggctc cctttagggt tccgatttag tgctttacgg cacctcgacc 4320

ccaaaaaact tgattagggt gatggttcac gtagtgggcc atcgccctga tagacggttt 4380

ttcgcccttt gacgttggag tccacgttct ttaatagtgg actcttgttc caaactggaa 4440

caacactcaa ccctatctcg gtctattctt ttgatttata agggattttg ccgatttcgg 4500

cctattggtt aaaaaatgag ctgatttaac aaaaatttaa cgcgaatttt aacaaaatat 4560

taacgcttac aatttgccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg 4620

tgcgggcctc ttcgctatta cgccagccca agctaccatg ataagtaagt aatattaagg 4680

tacgggaggt acttggagcg gccgcaataa aatatcttta ttttcattac atctgtgtgt 4740

tggttttttg tgtgaatcga tagtactaac atacgctctc catcaaaaca aaacgaaaca 4800

aaacaaacta gcaaaatagg ctgtccccag tgcaagtgca ggtgccagaa catttctcta 4860

tcgataggta ccgattagtg aacggatctc gacggtatcg atcacgagac tagcctcgag 4920

cggccgcccc cttcacc 4937

<210> 8

<211> 86

<212> DNA

<213> Artificial Sequence

<400> 8

gtttttgtac tctcaagaaa ttgcagaagc tacaaagata aggcttcatg ccgaaatcaa 60

caccctgtct cttggcgggg tgtttt 86
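The declared length of SEQ ID NO: 8 and its relationship to SEQ ID NO: 7 can be checked directly from the listing text. The sketch below is illustrative only: `parse_listing_lines` is a hypothetical helper that assumes each sequence line consists of space-separated blocks followed by a running count, and only the two lines of SEQ ID NO: 7 covering positions 241 to 360 are copied in.

```python
def parse_listing_lines(lines):
    """Join sequence-listing lines, dropping block spacing and the trailing running count."""
    parts = []
    for line in lines:
        tokens = line.split()
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]  # remove the running nucleotide count
        parts.append("".join(tokens))
    return "".join(parts)


# SEQ ID NO: 8, copied from the listing.
seq8 = parse_listing_lines([
    "gtttttgtac tctcaagaaa ttgcagaagc tacaaagata aggcttcatg ccgaaatcaa 60",
    "caccctgtct cttggcgggg tgtttt 86",
])

# Positions 241-360 of SEQ ID NO: 7, copied from the listing.
seq7_241_360 = parse_listing_lines([
    "cgaaacaccg tgagaccgag agagggtctc agtttttgta ctctcaagaa attgcagaag 300",
    "ctacaaagat aaggcttcat gccgaaatca acaccctgtc tcttggcggg gtgttttttt 360",
])

print(len(seq8))                # 86, matching the <211> field
offset = seq7_241_360.find(seq8)
print(241 + offset)             # 272: 1-based start of SEQ ID NO: 8 within SEQ ID NO: 7
```

With these inputs, the 86-nt SEQ ID NO: 8 occurs verbatim within SEQ ID NO: 7, starting at position 272. What that element represents functionally is not stated in this section, so the check says nothing beyond sequence identity.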

<210> 9

<211> 7

<212> PRT

<213> Artificial Sequence

<400> 9

Pro Lys Lys Lys Arg Lys Val

1 5

<210> 10

<211> 3743

<212> DNA

<213> Artificial Sequence

<400> 10

agcggaggat cctctggcag cgagacacca ggaacaagcg agtcagcaac accagagagc 60

agtggcggca gcagcggcgg cagcaacggc aagatcctgg gactggccat cggagttgca 120

tctgttggag tgggcatcct ggacaagaag accggcgaga tcatccacgc cagcagcaga 180

atcttccccg ccgccacagc cgatagcaac gtggaacgga ggggcttcag acagggaaga 240

cggctgggcc gtagaaaaaa acacagaaag gtgcggttgg ccgatctgtt cagcgacacc 300

ggcctgataa cagacttctc taaagtgtct atcaacctga acccctacga gctgcggatc 360

aagggcctca atgagaaact gacaaacgag gaactgttca tcgccctgaa gaacatcgtg 420

aagagaagag gcatcagcta cctggatgac gccaatgagg acggcgagag ctcctctagc 480

gagtacggca aggctgtgga agaaaaccga aagttgctgg ccgacaagac tcctggccag 540

atccagctgg aacgcttcga aaagtacgga caggtccgag gagatttcac catcgaggaa 600

aacggcgaaa agcatagact gctgaacgtg ttcagcacca gcgcctataa gaaagaagcc 660

gagcggattc tgaccaagca gcaagattac aaccaagaca tcaccgacga gttcatccag 720

gcctacctga caatcctgac gggaaagaga aagtactacc atggccccgg caacgagaag 780

tctagaaccg actacggccg gttcaggacc gatggcacca ccctggacaa catctttggc 840

atcctgatcg gcaaatgtac attctaccca gaggagtacc gggcggccaa ggcctcttac 900

accgcccagg agtttaacct cctgaatgac ctgaacaatc tgacagttcc aaccgagaca 960

aagaaactga gcgaggaaca gaagcggcaa atcatcgagt acgccaaggg agccaagaca 1020

cttggagccg ccaccctgct caagtacatc gccaagctgg tggacggctc tgtggaggat 1080

atcaagggct atagaattga taaaagcgag aaacctgaga tgcacacatt cgatatctac 1140

agaaagatgc agacactgga aaccgtggat gtggaaaagc tgtcacgcga ggtgctggat 1200

gagctggccc atatcctgac actgaatacc gagagagaag gtatcgagga ggccatcaag 1260

gtcagcttta tcaagagaga gttcgaacag gaccagatcg ccgagctggt cagcttccgg 1320

aagtccaact ctagcctgtt tggcaagggc tggcacaact tcagtatcaa actgatgaca 1380

gaactgatcc ccgagctgta tgagaccagc gaagagcaga tgaccatcct gaccagactg 1440

ggaaagcaaa agacaaaggc tagaagcaag cgcacaaagt acatcgacga gaaggagctg 1500

accgacgaga tctacaaccc cgtggtggcc aagagcgtga gacaggccat taagatcatc 1560

aacctggcca ccaagaagta cggcgtgttc gacaacatcg tgatcgagat ggccagagag 1620

aacaacgagg aggatgccaa gaaagattac gtgaaaagac aaaaagctaa tgaggacgaa 1680

aagaacgccg ctatggaaaa ggctgcccac cagtacaacg gcaagaagga gctgcccgat 1740

aacgtgtttc acggccacaa ggaactggcc acaaagatca gactgtggca ccagcagggc 1800

gagaagtgcc tgtacaccgg caaaaacatc cctatctctg atctgatcca caaccagtat 1860

aagtacgaga tcgaccacat cctgcctctg tcactgagct tcgacgacag cctggccaat 1920

aaggtgctgg tgctcgctac cgccaaccag gagaagggcc aaagaacacc tttccaggcc 1980

ctcgacagca tggacgatgc gtggtcctat agagaattta aggcctacgt gcggggcgcc 2040

agagccctga gcaacaagaa aaaagattac ctgctgaatg aagaggacat caacaagatc 2100

gaagtgaagc agaaattcat cgagaggaac cttgtggaca ctcggtactc ctctagagtg 2160

gtcctgaacg ccctgcagga cttctacaag ctgaatgatt tcgacaccaa gatcagcgtg 2220

gtgagaggcc agttcaccag ccagctgaga cggaaatgga gaatcgacaa gagcagagaa 2280

acctaccacc accacgccgt ggacgctctg atcattgccg ctagctcgca gctgagactg 2340

tggaagaagc agggcaaccc actgatcagc tacaaggaaa accagttcgt cgactccgaa 2400

accggagaaa ttatcagcct cacagatgat gaatacaagg aactggtgtt ccgggctcca 2460

tacgaccact tcgtggacac agtgagcagc aaaaagtttg aagacagaat ccttttctcc 2520

taccaggtgg attccaaata caaccggaaa atcagcgacg ccaccattta ctctaccaga 2580

aaggccaagc tgggcaaaga caagagcgag gaaacctacg tgctgggcaa gataaaggac 2640

atctacaccc agaccggcta cgatgccttc atcaagctgt acaagaagga caagtccaaa 2700

tttctgatgt accacaagga tcctatcacc tttgagaagg tgatcgagga aatcctgaag 2760

acctaccccg acaaggaaat caacgagaag ggcaaggaag tggcatgcaa cccttttgaa 2820

aaatatagac aggagaatgg acctctgaga aagtattcta agaaaggtaa gggccctgag 2880

atcaagagcc tgaagtacta cgacaacaaa ctcggcaacc acatcgacat aacccctgac 2940

aacagcgaaa atcaggtgat cctccagtcc ctgaaacctt ggcggaccga cgtgtacttc 3000

aaccacaaaa ccaagattta tgagctgatg ggcctgaagt acagcgacct gagcttcgag 3060

aagggcagcg gcaagtaccg gattagcctg gacaaatata acgtgatcaa gaaaaaggag 3120

ggcgtgcaca aggaaagcga gttcaagttc acactgtaca agaacgacct gatcctaatc 3180

aaggatctgg aaaagagcga gcagcagctg tttagataca acagccggaa cgatacatcc 3240

aagcactacg tggagctgaa gccttacgac aaggccaaat tcgagggaaa tcaacctctg 3300

atggccctgt tcggcaatgt ggccaaggga ggccagtgcc tgaagggcct gaacaaagcc 3360

aacatcagca tctacaaggt gcagaccgac gtgctgggca acaagcggtt catcaagaaa 3420

gaaggcgacg ctcctaagct ggaatttagc ggcgggagcg gcgggagcgg ggggagcact 3480

aatctgagcg acatcattga gaaggagact gggaaacagc tggtcattca ggagtccatc 3540

ctgatgctgc ctgaggaggt ggaggaagtg atcggcaaca agccagagtc tgacatcctg 3600

gtgcacaccg cctacgacga gtccacagat gagaatgtga tgctgctgac ctctgacgcc 3660

cccgagtata agccttgggc cctggtcatc caggattcta acggcgagaa taagatcaag 3720

atgctgagcg gaggatccgg agg 3743

<210> 11

<211> 30

<212> DNA

<213> Artificial Sequence

<400> 11

agcggaggat cctctggcag cgagacacca 30

<210> 12

<211> 33

<212> DNA

<213> Artificial Sequence

<400> 12

cctccggatc ctccgctcag catcttgatc tta 33

<210> 13

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 13

accgtgggca agagtttctg ccac 24

<210> 14

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 14

aaacgtggca gaaactcttg ccca 24

<210> 15

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 15

accgctgcgt tcctagaacc acag 24

<210> 16

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 16

aaacctgtgg ttctaggaac gcag 24

<210> 17

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 17

accgaatgct ggctacagat gtcc 24

<210> 18

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 18

aaacggacat ctgtagccag catt 24

<210> 19

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 19

accgctcata tgtcacttac ctct 24

<210> 20

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 20

aaacagaggt aagtgacata tgag 24

<210> 21

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 21

accggagaca ggatctcact gtgt 24

<210> 22

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 22

aaacacacag tgagatcctg tctc 24

<210> 23

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 23

accgtgctct aggtggtgtt aatg 24

<210> 24

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 24

aaaccattaa caccacctag agca 24

<210> 25

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 25

accgcagcaa catgaacaac tgaa 24

<210> 26

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 26

aaacttcagt tgttcatgtt gctg 24

<210> 27

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 27

accgaagagc caagtcttac tgta 24

<210> 28

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 28

aaactacagt aagacttggc tctt 24

<210> 29

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 29

accgctgaca agtactagct tatg 24

<210> 30

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 30

aaaccataag ctagtacttg tcag 24

<210> 31

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 31

accgttcctc atagcaacat cact 24

<210> 32

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 32

aaacagtgat gttgctatga ggaa 24
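SEQ ID NOs: 13 to 32 are listed as ten consecutive pairs of 24-nt oligonucleotides; each odd-numbered entry begins with accg and each even-numbered entry begins with aaac. The sketch below checks, for three of those pairs, that the 20 nt following the prefixes are reverse complements of each other. This pattern is consistent with annealed-oligo cloning of guide sequences, but that reading is an assumption not stated in this section, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def is_annealing_pair(top: str, bottom: str,
                      top_prefix: str = "accg", bottom_prefix: str = "aaac") -> bool:
    """True if both oligos carry the expected 5' prefix and their remainders are reverse complements."""
    top = top.replace(" ", "")
    bottom = bottom.replace(" ", "")
    if not (top.startswith(top_prefix) and bottom.startswith(bottom_prefix)):
        return False
    return reverse_complement(top[len(top_prefix):]) == bottom[len(bottom_prefix):]


# Oligo pairs copied from the listing, keyed by their SEQ ID NOs.
pairs = {
    (13, 14): ("accgtgggca agagtttctg ccac", "aaacgtggca gaaactcttg ccca"),
    (15, 16): ("accgctgcgt tcctagaacc acag", "aaacctgtgg ttctaggaac gcag"),
    (31, 32): ("accgttcctc atagcaacat cact", "aaacagtgat gttgctatga ggaa"),
}

for ids, (top, bottom) in pairs.items():
    print(ids, is_annealing_pair(top, bottom))  # True for each of the three pairs
```

The remaining pairs can be added to `pairs` in the same form to extend the check to SEQ ID NOs: 17 to 30.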

<210> 33

<211> 19

<212> DNA

<213> Artificial Sequence

<400> 33

ctgacctggc agataccac 19

<210> 34

<211> 20

<212> DNA

<213> Artificial Sequence

<400> 34

ccacaggact taggaacgac 20

<210> 35

<211> 23

<212> DNA

<213> Artificial Sequence

<400> 35

cccttgaaaa gtgcagtgtg tcg 23

<210> 36

<211> 23

<212> DNA

<213> Artificial Sequence

<400> 36

ggcaattccc tttgaaagac tgc 23

<210> 37

<211> 21

<212> DNA

<213> Artificial Sequence

<400> 37

ccgaggtact gttgctgctt c 21

<210> 38

<211> 22

<212> DNA

<213> Artificial Sequence

<400> 38

gagatggcaa gcctttgttg cg 22

<210> 39

<211> 22

<212> DNA

<213> Artificial Sequence

<400> 39

gatgctcatt ggtagctcgt gc 22

<210> 40

<211> 25

<212> DNA

<213> Artificial Sequence

<400> 40

ctatctgtcc atccatgcat ttgcc 25

<210> 41

<211> 20

<212> DNA

<213> Artificial Sequence

<400> 41

cctactgcgg atgccttctt 20

<210> 42

<211> 21

<212> DNA

<213> Artificial Sequence

<400> 42

ttagcttggt gtggcagcat g 21

<210> 43

<211> 25

<212> DNA

<213> Artificial Sequence

<400> 43

caagtcattg tgatgactga ggagc 25

<210> 44

<211> 19

<212> DNA

<213> Artificial Sequence

<400> 44

ggccagccta tgatgggcc 19

<210> 45

<211> 25

<212> DNA

<213> Artificial Sequence

<400> 45

ggatgctgtg atgactgaga cgtag 25

<210> 46

<211> 28

<212> DNA

<213> Artificial Sequence

<400> 46

tggacatttt gagtttgaaa aggctgtg 28

<210> 47

<211> 24

<212> DNA

<213> Artificial Sequence

<400> 47

caggcgtgct gtaatacatg aacc 24

<210> 48

<211> 26

<212> DNA

<213> Artificial Sequence

<400> 48

gtcaccatag gataggaagt cagcag 26

<210> 49

<211> 18

<212> DNA

<213> Artificial Sequence

<400> 49

gtcccactgc accagcag 18

<210> 50

<211> 32

<212> DNA

<213> Artificial Sequence

<400> 50

cctattctat ctgagggagg acatgattga ag 32

<210> 51

<211> 26

<212> DNA

<213> Artificial Sequence

<400> 51

ctctgcctgg aagaataatg agaacc 26

<210> 52

<211> 23

<212> DNA

<213> Artificial Sequence

<400> 52

ccaggatggt gtttgtgaga tgg 23
