Polymerase compositions and methods of making and using the same
1. A composition comprising an isolated polypeptide or biologically active fragment thereof, said polypeptide having at least 95% identity to SEQ ID NO:1, wherein said isolated polypeptide or biologically active fragment thereof exhibits polymerase activity and exhibits an improvement in one or more properties selected from thermal stability and/or sequencing properties when used in an emulsion PCR template amplification step of a sequencing workflow relative to SEQ ID NO:1 and/or SEQ ID NO:34 reference polymerase, said sequencing properties selected from read length, accuracy, strand bias, systematic error and total sequencing throughput, and wherein said isolated polypeptide comprises one or more amino acid substitutions selected from the group consisting of: L763F, L763F + E805I, or L763F + E397V + E745T.
2. The composition of claim 1, wherein the isolated polypeptide or biologically active fragment thereof has improved thermostability at 95 ℃ for 6 minutes compared to the thermostability of SEQ ID NO:1 at 95 ℃ for 6 minutes.
3. The composition of claim 2, wherein the isolated polypeptide or biologically active fragment thereof comprises L763F.
4. The composition of claim 3, wherein the isolated polypeptide or biologically active fragment thereof comprises L763F + E805I.
5. The composition of claim 3, wherein the sequencing properties are analyzed using an emulsion PCR template amplification reaction comprising 125mM KCl to generate a nucleic acid template, and wherein the nucleic acid sequence of the nucleic acid template is analyzed using a high throughput sequencing system.
6. The composition of claim 5, wherein the one or more properties exhibited by the isolated polypeptide or biologically active fragment thereof comprises at least two sequencing properties selected from the group consisting of: increased AQ20 mean read length, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb), and increased uniformity of coverage, all relative to a reference polymerase having the sequence of SEQ ID NO:34 and/or SEQ ID NO:1.
7. The composition of claim 1, wherein the isolated polypeptide or biologically active fragment thereof further comprises P6N and/or E295F and/or E794C.
8. The composition of claim 3, wherein the one or more properties are analyzed by performing an emulsion PCR template amplification reaction on a library consisting of templates with a GC content of 65%.
9. The composition of claim 1, wherein the isolated polypeptide or biologically active fragment thereof comprises A77E, A97V, K240I, L287T, or K292C relative to SEQ ID NO:1, and wherein the one or more properties comprise sequencing properties analyzed using a high throughput nucleic acid sequencing reaction, wherein the polypeptide or biologically active fragment thereof is used to perform an emulsion PCR template amplification reaction on a library comprised of templates having a GC content of 65%.
10. A composition according to claim 1, wherein the isolated polypeptide or biologically active fragment thereof includes L763F + E397V + E745T.
11. A method for amplifying a nucleic acid comprising contacting the nucleic acid with a modified polymerase or a biologically active fragment thereof under conditions suitable for amplifying the nucleic acid, and amplifying the nucleic acid, wherein the modified polymerase or biologically active fragment thereof is as defined in any one of claims 1 to 10.
12. The method of claim 11, wherein the suitable conditions comprise conditions suitable for performing a polymerase chain reaction, an emulsion polymerase chain reaction, an isothermal amplification reaction, a recombinase polymerase amplification reaction, a proximity ligation amplification, a rolling circle amplification, or a strand displacement amplification.
13. The method of claim 11, wherein the amplification is clonally amplifying the nucleic acid in solution or on a solid support.
14. A method for performing a polymerization reaction comprising contacting a modified polymerase having at least 95% sequence identity to SEQ ID No. 1, or a biologically active fragment thereof, with a nucleic acid template in the presence of one or more nucleotide triphosphates under conditions suitable for a polymerization reaction, wherein the modified polymerase or the biologically active fragment thereof is as defined in any one of claims 1 to 10.
15. A kit comprising two or more containers, wherein one container comprises components for performing a nucleic acid polymerization reaction and the other container comprises a modified polymerase or biologically active fragment thereof as defined in any one of claims 1 to 10, optionally wherein the kit comprises nucleotide triphosphates, MgCl2, and a buffer for nucleic acid polymerization reactions, and/or components for forming emulsions, and/or nucleic acid capture beads.
Background
The ability of enzymes to catalyze biological reactions is essential to life. A range of biological applications use enzymes to synthesize various biomolecules in vitro. One particularly useful class of enzymes is polymerases, which can catalyze the polymerization of biomolecules (e.g., nucleotides or amino acids) into biopolymers (e.g., nucleic acids or peptides). For example, polymerases that can polymerize nucleotides into nucleic acids, particularly in a template-dependent manner, are useful in recombinant DNA technology, nucleic acid detection, and nucleic acid sequencing applications. Many nucleic acid sequencing methods monitor nucleotide incorporation during in vitro template-dependent nucleic acid synthesis catalyzed by a polymerase. Single Molecule Sequencing (SMS) and paired-end sequencing (PES) typically include polymerases for template-dependent nucleic acid synthesis. Polymerases are also suitable for generating nucleic acid libraries, such as those generated during emulsion PCR or bridge PCR. Nucleic acid libraries produced using such polymerases can be used in a variety of downstream processes, such as genotyping; single nucleotide polymorphism (SNP) analysis; copy number variation analysis; epigenetic analysis; gene expression analysis; hybridization arrays; gene mutation analysis, including (but not limited to) detection, prognosis and/or diagnosis of a disease condition; detection and analysis of rare or low frequency allelic mutations; and nucleic acid sequencing, including (but not limited to) de novo sequencing or targeted re-sequencing.
A desirable quality of polymerases suitable for nucleic acid amplification, synthesis, and/or detection is improved nucleotide incorporation compared to a reference polymerase. Improved nucleotide incorporation can make processes such as nucleic acid library preparation and/or DNA sequencing more cost-effective by reducing the number of nucleic acid templates necessary to sequence a desired target molecule. In another aspect, improved nucleotide incorporation compared to a reference polymerase can also reduce the number of sequencing reads required to determine a desired target molecule sequence. In addition, improved nucleotide incorporation (compared to a reference polymerase) can also improve signal uniformity, leading to increased accuracy in the base determination of the desired target molecule. In yet another aspect, improved nucleotide incorporation by the modified polymerase compared to a reference polymerase can reduce the likelihood that the modified polymerase will stall or dissociate from the desired target molecule, and thus increase the read length of the desired target molecule. In yet another aspect, a modified polymerase having improved templating or clonal amplification efficiency compared to a reference polymerase can improve downstream sequencing of target molecules that are conventionally considered "difficult" target molecules, such as target molecules with high GC or AT content. Thus, it is an aspect of the present invention to provide a method, system, apparatus and composition of matter for reducing GC and AT bias in nucleic acid amplification using a modified polymerase with reduced GC or AT content bias.
Another desirable quality of enzymes used in nucleic acid library preparation or DNA sequencing is thermostability. DNA polymerases exhibiting thermostability have revolutionized many aspects of molecular biology and clinical diagnostics due to the development of the Polymerase Chain Reaction (PCR), which uses cycles of thermal denaturation, primer annealing, and enzymatic primer extension to amplify DNA templates. The prototype thermostable DNA polymerase used in the initial PCR experiments was Taq DNA polymerase, which was originally isolated from the thermophilic bacterium Thermus aquaticus.
There are three major families of DNA polymerases, referred to as families A, B and C. The classification of polymerases into one of these three families is based on the structural similarity of a given polymerase to E. coli DNA polymerase I (family A), II (family B), or III (family C). For example, family A DNA polymerases include, but are not limited to, Klenow DNA polymerase, Thermus aquaticus DNA polymerase I (Taq polymerase), and bacteriophage T7 DNA polymerase; family B DNA polymerases, previously known as alpha-family polymerases (Braithwaite and Ito, 1991, Nucleic Acids Res. 19:4045), include, but are not limited to, human alpha, delta and epsilon DNA polymerases, T4, RB69 and bacteriophage DNA polymerase, and Pyrococcus furiosus DNA polymerase (Pfu polymerase); and family C DNA polymerases include, but are not limited to, Bacillus subtilis DNA polymerase III and E. coli DNA polymerase III α and ε subunits (listed as products of the dnaE and dnaQ genes, respectively; Braithwaite and Ito, 1993, Nucleic Acids Res. 21:787). DNA polymerase protein sequence alignments across each family for archaea, bacteria, viruses, and a broad spectrum of eukaryotic organisms are presented in Braithwaite and Ito (1993, supra), which is incorporated herein by reference in its entirety.
When performing polymerase-dependent nucleic acid synthesis or amplification, it may be useful to modify the polymerase (e.g., via mutation or chemical modification) to alter its catalytic properties. In some cases, it may be useful to modify the polymerase to enhance its catalytic properties. In some embodiments, it may be useful to enhance the catalytic properties of the polymerase via site-directed amino acid substitutions or deletions. In some embodiments, it may be useful to enhance the catalytic properties of a polymerase via site saturation mutagenesis of one, more or each amino acid of the polymerase. In some embodiments, modifications of the polymerase can be made to enhance catalytic properties of the modified polymerase, such as read length, accuracy, and/or processivity.
The performance of polymerases in various biological assays involving nucleic acid synthesis or detection may be limited by the behavior of the polymerase with respect to nucleotide substrates, salt concentrations, or thermal conditions. For example, polymerase activity assays may be complicated by undesirable behaviors such as: dissociation of a given polymerase from the template; binding and/or incorporation of incorrect (e.g., non-Watson-Crick base-paired) nucleotides; or a tendency to release the correct (e.g., Watson-Crick base-paired) nucleotide rather than incorporate it. In addition, polymerase activity assays may be complicated by undesirable behavior in which the target molecule is not sufficiently denatured, such as in AT-rich and GC-rich regions, or by premature weakening of the target molecule. As demonstrated herein, desirable polymerase properties for improved nucleic acid amplification can be achieved via appropriate selection, engineering, and/or modification of the selected polymerase. For example, such modifications can be made to favorably alter the affinity of polymerase binding to template, processivity, accuracy of nucleotide incorporation, strand bias, and coverage. Such changes within the polymerase may also increase the amount of sequence information and/or the quality of sequencing information obtained directly or downstream from the amplification workflow improved by the use of such modified polymerases.
There remains a need in the art for improved polymerase compositions (and related methods, systems, devices, and kits) that exhibit altered properties, such as increased processivity, increased read length (including error-free read length), increased accuracy and/or affinity for DNA templates, increased coverage, reduced strand bias, and/or reduced systematic errors. Such polymerase compositions (and related methods, systems, devices, and kits) may be suitable for use in a variety of assays involving polymerase-dependent nucleic acid synthesis, including nucleic acid sequencing and/or nucleic acid library generation, such as nucleic acid libraries prepared by bridge PCR or clonal amplification.
Disclosure of Invention
The present invention provides in certain embodiments a composition comprising an isolated polypeptide having at least 50, 75, 100, 150, 175, 200, 250, 300, 350, 400, 500, 600, 700, or 800 consecutive amino acid residues having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO:1 or SEQ ID NO:34, or a biologically active fragment thereof, wherein the polypeptide exhibits polymerase activity. In exemplary embodiments, the isolated polypeptide exhibits an improvement in one or more characteristics selected from thermostability and/or sequencing characteristics selected from read length, accuracy, strand bias, systematic error, and total sequencing throughput relative to a reference polymerase of SEQ ID NO:1 and/or SEQ ID NO:34. In certain embodiments, the isolated polypeptide comprises one or more amino acid substitutions selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. In certain embodiments, the sequencing properties are determined by using the isolated polypeptide in an emulsion PCR template amplification reaction, in certain illustrative embodiments in the presence of 125mM KCl or 125mM NaCl, during sample preparation for the nucleic acid sequencing reaction. In certain embodiments, as exemplified herein, the sequencing characteristics are analyzed using a next generation (i.e., massively parallel, high throughput) sequencing workflow, such as an Ion Torrent (Life Technologies, Carlsbad, CA) sequencing workflow.
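The substitution labels above follow the usual convention of wild-type residue, 1-based position, and replacement residue (e.g., L763F). As a minimal sketch, assuming a reference sequence is available as a plain string, the following Python shows how such labels could be parsed and applied, and how a simple position-by-position percent identity could be computed for two sequences of equal length. The function names are hypothetical, and the position-wise identity measure is an assumption made for illustration; it is not the alignment-based identity calculation that would normally be used for sequences of different lengths.

```python
import re
from typing import Iterable, NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # residue expected in the reference sequence
    position: int   # 1-based position in the reference sequence
    mutant: str     # replacement residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'L763F' into its three parts."""
    match = _LABEL.match(label.strip().upper())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def apply_substitutions(reference: str, labels: Iterable[str]) -> str:
    """Return a copy of `reference` with every labelled substitution applied.

    Raises ValueError if a position is out of range or the reference does
    not carry the stated wild-type residue at that position.
    """
    residues = list(reference)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"{label}: position outside the reference")
        if reference[index] != sub.wild_type:
            raise ValueError(
                f"{label}: reference has {reference[index]!r} at {sub.position}"
            )
        residues[index] = sub.mutant
    return "".join(residues)


def percent_identity(a: str, b: str) -> float:
    """Position-wise identity (in percent) of two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)
```

With a real reference, a call such as `apply_substitutions(reference, ["L763F", "E805I"])` would fail loudly if the numbering did not match the reference, which is a cheap guard against off-by-one numbering errors.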
In certain aspects, the isolated polypeptides and modified polymerases used in the method embodiments provided herein have improved thermostability at 95 ℃, 96 ℃, or 97 ℃ for 2 minutes, 4 minutes, and in illustrative examples 6 minutes, as compared to the thermostability of SEQ ID NO:1 at the same temperature and for the same time period. In an illustrative example, thermal stability can be tested as follows. The polymerase tested and the control polymerase are incubated for 2 min, 4 min, and in the illustrative example 6 min, under the same conditions, including high temperature (e.g., 95 ℃, 96 ℃ or 97 ℃), in an incubation buffer including, for example, 15mM Tris (pH 7.5), 100mM KCl, 30% trehalose, 0.1% NP40 and 50mM polymerase. After incubation at elevated temperature, the solution can optionally be placed on ice and then transferred to an enzyme reaction mixture comprising 15mM Tris (pH 7.5), 100mM KCl, 8mM MgCl2, 150nM oligomer (Oligo) 221 and 5nM polymerase from the heat treatment step reaction mixture (10 μl). Oligo 221 is a hairpin oligomer with an attached fluorescent dye (TTTTTGCAGGTGACAGTTTTTCCTGTCACCTGGC (SEQ ID NO:50), where X is a fluorescein-dT residue). Upon addition of dATP, Oligo 221 is extended, causing fluorescence to be released. Thus, as a non-limiting example, thermal stability can be tested using the method provided in Example 10 herein, as outlined in FIG. 14. In certain illustrative embodiments, the isolated polypeptide has improved thermostability at 95 ℃ for 6 minutes compared to the thermostability of SEQ ID NO:1 at 95 ℃ for 6 minutes. In certain illustrative embodiments of these aspects, the thermostable isolated polypeptide or biologically active fragment thereof comprises G418C or E397V.
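One way to turn fluorescence readings from such an extension assay into a thermostability comparison is to estimate an initial rate for heat-treated and untreated enzyme and take their ratio. The Python below is a minimal sketch under that assumption; using a least-squares slope as the activity measure, and the names `slope` and `residual_activity`, are illustrative choices and are not taken from the example described herein.

```python
from typing import Sequence


def slope(times: Sequence[float], signal: Sequence[float]) -> float:
    """Least-squares slope of signal versus time (signal units per time unit)."""
    n = len(times)
    if n != len(signal) or n < 2:
        raise ValueError("need at least two paired time/signal readings")
    mean_t = sum(times) / n
    mean_s = sum(signal) / n
    denom = sum((t - mean_t) ** 2 for t in times)
    if denom == 0:
        raise ValueError("time points must not all be identical")
    numer = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signal))
    return numer / denom


def residual_activity(
    times: Sequence[float],
    heated: Sequence[float],
    unheated: Sequence[float],
) -> float:
    """Fraction of activity retained after heat treatment.

    Activity is approximated by the slope of the fluorescence trace; this
    is an assumption of the sketch, not a prescribed analysis.
    """
    baseline = slope(times, unheated)
    if baseline <= 0:
        raise ValueError("untreated enzyme shows no measurable activity")
    return slope(times, heated) / baseline
```

Under this sketch, a tested polymerase would count as more thermostable than the control when its `residual_activity` after the same incubation (for example, 95 ℃ for 6 minutes) is higher than that of the control.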
In still further embodiments, in addition to G418C or in particular aspects to the E397V mutation, the isolated peptide further comprises one or more amino acid substitutions selected from the group consisting of E745T, L763F and E805I, wherein numbering is relative to SEQ ID NO: 1. In certain aspects, the compositions include agents, such as oligonucleotides and/or aptamers, for hot-start activation mechanisms. In other aspects, the isolated polypeptide is chemically modified to provide a hot start mechanism.
In one embodiment of the invention, the one or more properties exhibited by the isolated polypeptide or biologically active fragment thereof in the composition include at least two, three, four, five, six or all sequencing workflow properties selected from the group consisting of: increased AQ20 mean read length, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb), and increased uniformity of coverage, all relative to a reference polymerase having the sequence of SEQ ID NO:34 and/or SEQ ID NO:1. In some embodiments, the isolated polypeptide or biologically active fragment thereof comprises mutations wherein one mutation is E397V and the other mutation is P6N, E745T, and/or L763F. In another embodiment that may or may not include E397V, the mutations include L763F and/or E805I, P6N and/or E295F, or E745T and/or E794C.
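The "at least two, three, four, five, six or all" criterion can be read as a count of properties that moved in the favorable direction relative to the reference polymerase. The Python below is a minimal sketch of that bookkeeping. The metric key names are hypothetical labels for the six listed properties, and the sketch assumes each property has already been reduced to a single number per run; how each number is measured is outside its scope.

```python
from typing import Dict, List

# True means a higher value is an improvement; False means a lower value is.
# The key names are hypothetical labels, not identifiers from any real tool.
HIGHER_IS_BETTER: Dict[str, bool] = {
    "aq20_mean_read_length": True,
    "strand_bias": False,
    "base_coverage": True,
    "accuracy": True,
    "throughput_mb": True,
    "coverage_uniformity": True,
}


def improved_properties(
    tested: Dict[str, float], reference: Dict[str, float]
) -> List[str]:
    """List the properties on which `tested` beats `reference`.

    Properties missing from either run are skipped rather than guessed.
    """
    improved = []
    for name, higher in HIGHER_IS_BETTER.items():
        if name not in tested or name not in reference:
            continue
        delta = tested[name] - reference[name]
        if (delta > 0) if higher else (delta < 0):
            improved.append(name)
    return improved


def meets_criterion(
    tested: Dict[str, float], reference: Dict[str, float], minimum: int = 2
) -> bool:
    """True when at least `minimum` properties improved over the reference."""
    return len(improved_properties(tested, reference)) >= minimum
```

Strict inequality is used, so a tie with the reference does not count as an improvement; a real comparison would likely also require the difference to exceed run-to-run variation.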
In another embodiment of the invention, one or more properties of the composition are exhibited when, or are analyzed or tested by, performing an emulsion PCR template amplification reaction on a library comprised of templates having a GC content of 65%. In certain embodiments, the reference polymerase is SEQ ID NO:34, and in certain particular illustrative embodiments, the reference polymerase is SEQ ID NO: 1.
In one embodiment of the composition of the invention, the isolated polypeptide or biologically active fragment thereof comprises a mutation selected from the group consisting of A77E, A97V, K240I, L287T, and K292C relative to a reference polymerase having the sequence of SEQ ID NO:1, and in an exemplary aspect of this embodiment, the one or more properties comprise a sequencing property analyzed using a high throughput nucleic acid sequencing reaction, wherein the polypeptide or biologically active fragment thereof is used to perform an emulsion PCR template amplification reaction on a library comprised of templates having a GC content of 65%.
In another embodiment, the isolated polypeptides of the composition include SEQ ID NO:2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 or 33. It is understood that in illustrative embodiments of the invention, including compositions and method embodiments comprising an isolated polypeptide or a modified polymerase provided herein, to analyze the isolated polypeptide or modified polymerase to determine whether it has certain properties, activities or characteristics, one may use an emulsion PCR reaction to amplify the template as part of a sequencing workflow, e.g., amplifying the template on a solid support, and in some illustrative embodiments clonally amplifying the template on a solid support. The nucleic acid sequence of at least a portion of the amplified template is then determined. As exemplified herein, this sequencing in the illustrative embodiments is performed using a high throughput sequencing platform, such as the Ion Torrent PGM. The results of this sequencing are compared to the results of similar experiments performed using a reference polymerase, such as Taq polymerase (SEQ ID NO:1) or modified Taq polymerase (SEQ ID NO:34), for the emulsion PCR template amplification step in a high throughput sequencing reaction.
In one aspect, testing an isolated polypeptide or mutant polymerase includes amplifying a pool of nucleic acid molecules onto a nucleic acid capture support (e.g., Ion Sphere™ Particles) using emulsion PCR with the polymerase tested and with a reference polymerase. In this example, the amplified nucleic acid molecules may then be loaded onto a PGM™ 314 sequencing chip, which can then be loaded into an Ion Torrent PGM™ sequencing system and sequenced. The sequencing results of the tested polymerase and the reference polymerase may then be compared.
In another embodiment, provided herein is a method (and related kits, devices, systems, and compositions) for amplifying a nucleic acid comprising contacting the nucleic acid with a modified polymerase or biologically active fragment thereof under conditions suitable for amplifying the nucleic acid, and amplifying the nucleic acid, wherein the modified polymerase or biologically active fragment thereof has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:34, exhibits polymerase activity and exhibits an improvement in one or more characteristics selected from thermostability and/or sequencing characteristics selected from read length, accuracy, strand bias, systematic error and total sequencing throughput, and wherein the modified polymerase comprises one or more amino acid substitutions selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. In certain particular embodiments, the sequencing properties are analyzed using an emulsion PCR template amplification reaction, which in particular illustrative embodiments comprises 125mM KCl or 125mM NaCl during sample preparation for the nucleic acid sequencing reaction. In certain embodiments, the sequencing reaction that analyzes the sequencing characteristics of the modified polymerase is part of a next generation (i.e., massively parallel, high-throughput) sequencing workflow (e.g., a workflow used in Ion Torrent systems, Illumina HiSeq or True Seq or X-10 systems). In some embodiments, the sequencing workflow uses ISFET-based sensors. In certain embodiments, as exemplified herein, sequencing characteristics are analyzed using Ion Torrent (Life Technologies, Carlsbad, CA) sequencing workflows and systems.
In certain aspects, the modified polymerase used in the methods has improved thermostability at 95 ℃ for 6 minutes compared to the thermostability of SEQ ID No. 1 at 95 ℃ for 6 minutes. In certain illustrative embodiments of these aspects, the thermostable modified polymerase or biologically active fragment thereof used in the methods includes G418C or E397V. In still further embodiments, in addition to G418C or in particular aspects to the E397V mutation, the isolated peptide further comprises one or more amino acid substitutions selected from the group consisting of E745T, L763F and E805I, wherein numbering is relative to SEQ ID NO: 1. In certain aspects, the method comprises hot start, as known in the art of PCR. In these methods of the invention involving hot start, the composition for performing the method may include reagents for the hot start, such as oligonucleotides and/or aptamers, or the modified polymerase may be chemically modified to provide a hot start mechanism.
In one embodiment of the invention, the one or more properties exhibited by the modified polymerase or biologically active fragment thereof used in the method comprise at least two, three, four, five, six or all sequencing workflow properties selected from: increased AQ20 mean read length, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb), and increased uniformity of coverage, all relative to a reference polymerase having the sequence of SEQ ID NO:34 and/or SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises mutations wherein one mutation is E397V and the other mutation is P6N, E745T, and/or L763F. In another embodiment that may or may not include E397V, the mutations include L763F and/or E805I, P6N and/or E295F, or E745T and/or E794C.
In another embodiment of the invention, one or more properties of the composition are exhibited when an emulsion PCR template amplification reaction is performed on a library comprised of templates having a GC content of 65%, or can be determined by performing the amplification reaction. For the sake of clarity, such steps are not part of the method of the invention, but are instead benchmarks for determining whether a modified polymerase qualifies as the modified polymerase used in the method. In certain embodiments, the reference polymerase used for the polymerase benchmark test is SEQ ID NO:34, and in certain particular illustrative embodiments, the reference polymerase is SEQ ID NO:1.
In one embodiment of the method of the invention, the modified polymerase or biologically active fragment thereof used in the method comprises a mutation selected from the group consisting of: A77E, A97V, K240I, L287T and K292C. In exemplary aspects of this embodiment, the one or more properties of the modified polypeptide used in the method include sequencing properties analyzed using a next generation (i.e., massively parallel, high-throughput) nucleic acid sequencing reaction, wherein such properties of the modified polymerase or biologically active fragment thereof are tested using an emulsion PCR template amplification reaction on a library consisting of templates with a GC content of 65%.
In certain embodiments, the polymerase used in the methods comprises 50, 75, 100, 150, 175, 200, 250, 300, 350, 400, 500, 600, 700, or 800 consecutive amino acid residues of SEQ ID NO:1 or SEQ ID NO:34, and has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO:1 or SEQ ID NO:34, or a biologically active fragment thereof. In certain embodiments, the modified polymerase used in the methods comprises SEQ ID NO:2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 or 33.
In some embodiments of the method for amplifying a nucleic acid, the conditions suitable for performing amplification are conditions suitable for performing the following reactions: polymerase chain reaction, isothermal amplification reaction, recombinase polymerase amplification reaction, proximity ligation amplification, rolling circle amplification, strand displacement amplification, or emulsion polymerase chain reaction. Thus, in these embodiments, the method used to amplify the nucleic acid is one of the methods listed above for amplification.
In yet another embodiment, the method for amplifying nucleic acids comprises clonally amplifying nucleic acids in solution or on a solid support. Another embodiment of the method comprises determining the nucleic acid sequence of at least a portion of the nucleic acid. In some embodiments, nucleic acid sequences can be determined using any next generation (i.e., massively parallel, high-throughput) sequencing platform (e.g., Ion Torrent system, Illumina HiSeq or True Seq or X-10 system). In some embodiments, the nucleic acid sequence may be determined using any ISFET-based sequencing system.
In another embodiment of the method, the nucleic acid comprises at least 65% GC content or at least 65% AT content.
Another embodiment of the invention is a method for performing a nucleic acid polymerization reaction comprising contacting a modified polymerase or a biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotide triphosphates under conditions suitable for a polymerization reaction, wherein the modified polymerase or biologically active fragment thereof has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:34, exhibits polymerase activity and exhibits an improvement in one or more properties selected from thermostability and/or sequencing properties selected from read length, accuracy, strand bias, systematic error and total sequencing throughput relative to SEQ ID NO:1 and/or SEQ ID NO:34 reference polymerase, and wherein the modified polymerase comprises one or more amino acid substitutions selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. To analyze the sequencing properties of the modified polymerase of the methods of the invention, an emulsion PCR template amplification reaction may be used, and in particular embodiments, conditions comprising 125mM KCl or 125mM NaCl are used in sample preparation for a subsequent nucleic acid sequencing reaction. For clarity, the emulsion PCR template amplification reaction and nucleic acid sequencing reaction described above are not steps of the method of this embodiment of the invention. Rather, they are part of a method that can be used to determine whether a polymerase is a modified polymerase used in embodiments of the methods described herein.
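Whether a given template meets the 65% threshold can be checked directly from its sequence. The Python below is a minimal sketch, assuming the template is given as a string of A, C, G and T; characters outside that alphabet (such as N) are ignored, which is a choice made for this illustration, and the function names are hypothetical.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(b in "GC" for b in bases)
    return gc / len(bases)


def is_gc_or_at_rich(sequence: str, threshold: float = 0.65) -> bool:
    """True when GC content or AT content is at least `threshold`."""
    gc = gc_fraction(sequence)
    at = 1.0 - gc
    return gc >= threshold or at >= threshold
```

Because AT content is computed as the complement of GC content over the same counted bases, a template can satisfy at most one of the two conditions at the 65% threshold.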
In certain embodiments, the sequencing workflow to analyze the sequencing characteristics of the modified polymerase is a next generation (i.e., massively parallel, high-throughput) sequencing workflow (e.g., a workflow used in an ion torrent system, Illumina HiSeq or True Seq or X-10 system). In some embodiments, the sequencing workflow uses an ISFET-based sequencing system workflow. In certain embodiments, as exemplified herein, sequencing characteristics are analyzed using ion torrent (life technologies corporation, carlsbad, california) sequencing workflows and systems. In certain aspects, the modified polymerase used in the methods has improved thermostability at 95 ℃ for 6 minutes compared to the thermostability of SEQ ID No. 1 at 95 ℃ for 6 minutes. In certain illustrative embodiments of these aspects, the thermostable modified polymerase or biologically active fragment thereof used in the methods includes G418C or E397V. In still further embodiments, in addition to G418C or in particular aspects to the E397V mutation, the isolated peptide further comprises one or more amino acid substitutions selected from the group consisting of E745T, L763F and E805I, wherein numbering is relative to SEQ ID NO: 1. In certain aspects, the method comprises hot start, as known in the art of PCR. In these methods of the invention involving hot start, the composition for performing the method may include reagents for the hot start, such as oligonucleotides and/or aptamers, or the modified polymerase may be chemically modified to provide a hot start mechanism.
In one embodiment of the invention, the one or more properties exhibited by the modified polymerase or biologically active fragment thereof used in the method comprise at least two, three, four, five, six or all sequencing workflow properties selected from: increased AQ20 mean read length reads, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb), and increased uniformity of coverage, all relative to a reference polymerase having the following sequence: 34 and/or 1. In some embodiments, the modified polymerase or biologically active fragment thereof, wherein one mutation is E397V and the other mutation is P6N, E745T, and/or L763F. In another embodiment that may or may not include E397V, the mutations include L763F and/or E805I, P6N and/or E295F, or E745T and/or E794C.
In another embodiment of the invention, one or more properties of the composition are exhibited when an emulsion PCR template amplification reaction is performed on a library comprised of templates having a GC content of 65%, or can be determined by performing the amplification reaction. For the sake of clarity, such steps are not part of the method of the invention, but are instead benchmarks for determining whether a modified polymerase qualifies as a modified polymerase for use in the method. In certain embodiments, the reference polymerase used for the polymerase benchmark test is SEQ ID NO:34, and in certain particular illustrative embodiments, the reference polymerase is SEQ ID NO:1.
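By way of non-limiting illustration only, the GC content of a template or of a template library referred to above may be computed as sketched below. The function names and the length-weighted treatment of a library are assumptions made for this sketch; the disclosure does not specify how the 65% figure is calculated.

```python
def gc_content(seq):
    """Return the fraction (0.0 to 1.0) of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def library_gc_content(templates):
    """Return the length-weighted GC fraction across all templates."""
    total = sum(len(t) for t in templates)
    if total == 0:
        raise ValueError("library contains no bases")
    gc = sum(1 for t in templates for base in t.upper() if base in "GC")
    return gc / total
```

For example, a 20-base template containing 13 G or C bases has a GC content of 0.65, i.e., 65%.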
In one embodiment of the method of the invention, the modified polymerase or biologically active fragment thereof used in the method comprises a mutation selected from the group consisting of: A77E, A97V, K240I, L287T or K292C. In exemplary aspects of this embodiment, the one or more properties of the modified polypeptide used in the method include sequencing properties analyzed using a next generation (high throughput) nucleic acid sequencing reaction, wherein such properties of the modified polymerase or biologically active fragment thereof are tested using an emulsion PCR template amplification reaction on a library consisting of templates with a GC content of 65%.
In certain embodiments, the polymerase used in the methods comprises 50, 75, 100, 150, 175, 200, 250, 300, 350, 400, 500, 600, 700, or 800 consecutive amino acid residues of SEQ ID NO:1 or SEQ ID NO:34, and has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO:1 or SEQ ID NO:34, or a biologically active fragment thereof. In certain embodiments, the modified polymerase used in the methods comprises SEQ ID NO:2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 or 33.
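By way of non-limiting illustration only, a percent identity value of the kind recited above may be computed from two sequences that have already been aligned, as sketched below. The sketch assumes that the alignment has been produced elsewhere, that gaps are written as "-", and that identity is counted over all aligned columns other than those in which both sequences have a gap. Other conventions for the denominator exist, and the disclosure does not specify which is used.

```python
def percent_identity(aligned_query, aligned_ref):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for q, r in zip(aligned_query, aligned_ref):
        if q == "-" and r == "-":
            continue
        compared += 1
        if q == r:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity(aligned_query, aligned_ref, threshold_percent):
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_query, aligned_ref) >= threshold_percent
```

Under these assumptions, two aligned 10-residue sequences that differ at one position have 90% identity, which satisfies the 70% through 90% thresholds but not the 95% threshold.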
In yet another embodiment of the present invention, provided herein is a method for obtaining sequence information from a nucleic acid template, comprising: providing a reaction mixture comprising the nucleic acid template hybridized to a sequencing primer and bound to a modified polymerase or biologically active fragment thereof; contacting the template nucleic acid with at least one type of nucleotide triphosphate, wherein said contacting comprises incorporating one or more nucleotides from at least one type of nucleotide to the 3' end of the sequencing primer and generating an extended primer product; detecting the presence of the extended primer product in the reaction mixture, thereby determining whether nucleotide incorporation has occurred; and identifying at least one of the one or more nucleotides incorporated from the at least one type of nucleotide triphosphate, wherein the modified polymerase or biologically active fragment thereof has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:34, exhibits polymerase activity and exhibits an improvement in one or more characteristics selected from thermostability and/or sequencing workflow characteristics relative to a SEQ ID NO:1 and/or SEQ ID NO:34 reference polymerase, the sequencing characteristics selected from read length, accuracy, strand bias, systematic error and total sequencing throughput, and wherein the modified polymerase comprises one or more amino acid substitutions selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
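By way of non-limiting illustration only, the substitution notation used above (for example, L763F denotes replacement of the leucine at position 763 of the reference with phenylalanine, with 1-based numbering relative to the reference) may be parsed and applied as sketched below. The function names are hypothetical, and the short reference string in the usage note is a toy sequence, not SEQ ID NO:1.

```python
import re

# One-letter original residue, 1-based position, one-letter new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'L763F' into ('L', 763, 'F')."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError("not a substitution label: %r" % (label,))
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(reference, labels):
    """Return a copy of `reference` with every substitution applied.

    Each label is checked against the reference so that a numbering
    mismatch is reported instead of silently producing a wrong sequence.
    """
    residues = list(reference)
    for label in labels:
        original, position, new = parse_substitution(label)
        if position < 1 or position > len(residues):
            raise ValueError("%s: position is outside the reference" % label)
        if reference[position - 1] != original:
            raise ValueError("%s: reference has %s at that position"
                             % (label, reference[position - 1]))
        residues[position - 1] = new
    return "".join(residues)
```

Applied to the toy reference `MKLAE`, the labels `L3F` and `E5I` yield `MKFAI`; a combination such as L763F + E805I would be applied to a full-length reference in the same way.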
Sequencing workflow characteristics of the modified polymerase or biologically active fragment thereof can be analyzed using an emulsion PCR template amplification reaction, for example, comprising 125mM KCl or 125mM NaCl during sample preparation of a nucleic acid sequencing reaction. In certain embodiments, the method is a next generation sequencing method. In some embodiments, the method uses an ISFET detection system.
In certain aspects, the modified polymerase used in the methods has improved thermostability at 95 ℃ for 6 minutes compared to the thermostability of SEQ ID NO:1 at 95 ℃ for 6 minutes. In certain illustrative embodiments of these aspects, the thermostable modified polymerase or biologically active fragment thereof used in the methods includes G418C or E397V. In still further embodiments, in addition to the G418C mutation or, in particular aspects, the E397V mutation, the isolated polypeptide further comprises one or more amino acid substitutions selected from the group consisting of E745T, L763F and E805I, wherein numbering is relative to SEQ ID NO:1.
In one embodiment of the invention, the one or more properties exhibited by the modified polymerase or biologically active fragment thereof used in the method comprise at least two, three, four, five, six or all sequencing workflow properties selected from: increased AQ20 mean read length, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb), and increased uniformity of coverage, all relative to a reference polymerase having the sequence of SEQ ID NO:34 and/or SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises two or more mutations, wherein one mutation is E397V and the other mutation is P6N, E745T, and/or L763F. In another embodiment that may or may not include E397V, the mutations include L763F and/or E805I, P6N and/or E295F, or E745T and/or E794C.
In one embodiment of the method of the invention, the modified polymerase or biologically active fragment thereof used in the method comprises a mutation selected from the group consisting of: A77E, A97V, K240I, L287T or K292C. In exemplary aspects of this embodiment, the one or more properties of the modified polypeptide used in the method include sequencing properties analyzed using a next generation (high throughput) nucleic acid sequencing reaction, wherein such properties of the modified polymerase or biologically active fragment thereof are tested using an emulsion PCR template amplification reaction on a library consisting of templates with a GC content of 65%.
In certain embodiments, the polymerase used in the methods comprises 50, 75, 100, 150, 175, 200, 250, 300, 350, 400, 500, 600, 700, or 800 consecutive amino acid residues of SEQ ID NO:1 or SEQ ID NO:34, and has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO:1 or SEQ ID NO:34, or a biologically active fragment thereof. In certain embodiments, the modified polymerase used in the methods comprises SEQ ID NO:2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 or 33.
In further aspects of the method, the contacting, detecting, and identifying steps are repeated more than once, thereby identifying a plurality of consecutive nucleotide incorporations. In certain aspects, at least one of the incorporated nucleotides is a reversible terminator nucleotide.
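By way of non-limiting illustration only, the repetition of the contacting, detecting, and identifying steps may be pictured with the idealized, error-free simulation sketched below, in which one nucleotide type is supplied per cycle and the number of incorporations in that cycle is recorded. The cyclic flow order "TACG", the function names, and the representation of the template as the string of bases read in the order the polymerase copies them are assumptions made for this sketch. The sketch does not model reversible terminators, which would limit each cycle to a single incorporation, and it does not model any particular instrument.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flows(template_3to5, flow_order="TACG", max_flows=1000):
    """Return a list of (nucleotide, incorporation_count) per flow.

    `template_3to5` lists template bases in the order they are copied,
    so its first base pairs with the first nucleotide added to the primer.
    """
    position = 0
    signals = []
    flow = 0
    while position < len(template_3to5) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        count = 0
        # Contacting: the supplied nucleotide is incorporated for as long
        # as it pairs with the next template base (homopolymer runs).
        while (position < len(template_3to5)
               and COMPLEMENT[template_3to5[position]] == nucleotide):
            count += 1
            position += 1
        # Detecting: record the incorporation count (zero means none).
        signals.append((nucleotide, count))
        flow += 1
    return signals


def call_bases(signals):
    """Identifying: rebuild the synthesized strand from the flow signals."""
    return "".join(nucleotide * count for nucleotide, count in signals)
```

For the toy template `AACGT`, the first flow of T records two incorporations, and the bases called across all flows are `TTGCA`.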
In another embodiment, provided herein is a kit having two or more containers, wherein one container comprises components for performing a nucleic acid polymerization reaction and the other container comprises a modified polymerase or a biologically active fragment thereof having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:34, exhibiting polymerase activity and exhibiting an improvement in one or more characteristics selected from thermostability and/or sequencing workflow characteristics relative to a SEQ ID NO:1 and/or SEQ ID NO:34 reference polymerase, the sequencing characteristics selected from read length, accuracy, strand bias, systematic error and total sequencing throughput, and wherein the modified polymerase comprises one or more amino acid substitutions selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. The properties can be measured using an emulsion PCR template amplification reaction, which in certain illustrative embodiments is performed in the presence of 125mM KCl or 125mM NaCl during sample preparation for a nucleic acid sequencing reaction (e.g., a high throughput or next generation sequencing reaction).
In further embodiments of the invention, the kit comprises nucleotide triphosphates, MgCl2, and/or a buffer for nucleic acid polymerization reactions. The kit may further comprise reagents for a hot start mechanism. In yet another embodiment, the kit comprises components for forming an emulsion.
In some embodiments, the present invention generally relates to methods (and related kits, systems, devices, and compositions) for performing a nucleotide polymerization reaction comprising or consisting of contacting a modified polymerase or biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, wherein the modified polymerase or biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy, coverage, and/or processivity as compared to a reference polymerase; and polymerizing at least one of the one or more nucleotides using the modified polymerase or biologically active fragment thereof.
In some embodiments, the present invention generally relates to methods (and related kits, systems, devices, and compositions) for performing a nucleotide polymerization reaction comprising or consisting of contacting a modified polymerase or a biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, wherein the modified polymerase or the biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase, and wherein the modified polymerase or the biologically active fragment thereof has increased thermostability relative to the reference polymerase; and polymerizing at least one of the one or more nucleotides using the modified polymerase or biologically active fragment thereof. In some embodiments, the method comprises polymerizing at least one of the one or more nucleotides using the modified polymerase or biologically active fragment thereof in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution may comprise a solution in excess of 100mM KCl. In some embodiments, the high ionic strength solution comprises a solution of at least 120mM KCl. In some embodiments, the high ionic strength solution comprises a solution of 125mM to 200mM KCl.
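By way of non-limiting illustration only, the KCl concentration ranges recited above for a high ionic strength solution may be expressed as simple checks, as sketched below. The function name and the returned descriptions are hypothetical; only the numeric bounds are taken from the preceding paragraph.

```python
def high_ionic_strength_embodiments(kcl_mm):
    """Return the recited KCl ranges that a concentration (in mM) satisfies."""
    satisfied = []
    if kcl_mm > 100:
        satisfied.append("in excess of 100 mM KCl")
    if kcl_mm >= 120:
        satisfied.append("at least 120 mM KCl")
    if 125 <= kcl_mm <= 200:
        satisfied.append("125 mM to 200 mM KCl")
    return satisfied
```

Under this sketch, a reaction at 125 mM KCl satisfies all three recited ranges, whereas a reaction at 110 mM KCl satisfies only the first.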
In some embodiments, the method can further comprise polymerizing one of the at least one nucleotide in a template-dependent manner. In some embodiments, the polymerization is conducted under thermal cycling conditions. In some embodiments, the method may further comprise hybridizing a primer to the nucleic acid template before, during, or after the contacting, and wherein the polymerizing comprises polymerizing one of the at least one nucleotide onto the end of the primer using the modified polymerase or biologically active fragment thereof. In some embodiments, the polymerizing is performed in the vicinity of a sensor capable of detecting the polymerization of at least one nucleotide by the modified polymerase or biologically active fragment thereof. In some embodiments, the method may further comprise using a sensor to detect a signal indicative of polymerization of the at least one nucleotide by the modified polymerase or biologically active fragment thereof. In some embodiments, the sensor is an ISFET. In some embodiments, the sensor may comprise a detectable label or detectable reagent within the polymerization reaction.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 80% identity to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO:1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 90% identity to SEQ ID NO:1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 95% identity to SEQ ID NO:1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO:1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO:1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 98% identity to SEQ ID NO:1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 90% identity to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO:1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 90% identity to SEQ ID NO:1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 95% identity to SEQ ID NO:1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO:1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO:1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 98% identity to SEQ ID NO:1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO:2, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO:2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO:3, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO:3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO:4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO:4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO:34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO:2, SEQ ID NO:3 or SEQ ID NO:4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO:34.
In some embodiments, the present invention generally relates to methods (and related kits, devices, systems, and compositions) for performing nucleic acid amplification comprising or consisting of generating an amplification reaction mixture having a modified polymerase or biologically active fragment thereof, a primer, a nucleic acid template, and one or more nucleotides, wherein the modified polymerase or biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase and has improved thermostability relative to the reference polymerase; and subjecting the amplification reaction mixture to amplification conditions, wherein at least one of the one or more nucleotides is polymerized onto the ends of the primers using the modified polymerase or biologically active fragment thereof. In some embodiments, a modified polymerase or biologically active fragment thereof having improved thermostability relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 80% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, a modified polymerase or biologically active fragment thereof having improved thermostability relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 90% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, a modified polymerase or biologically active fragment thereof having improved thermostability relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 95% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, a modified polymerase or biologically active fragment thereof having improved thermostability relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 98% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, a modified polymerase or biologically active fragment thereof having improved thermostability relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 99% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, the present invention generally relates to methods (and related kits, devices, systems, and compositions) for performing nucleic acid amplification comprising, or consisting of, generating an amplification reaction mixture having a modified polymerase or biologically active fragment thereof, a primer, a nucleic acid template, and one or more nucleotides, wherein the modified polymerase or biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase and has improved accuracy relative to the reference polymerase; and subjecting the amplification reaction mixture to amplification conditions, wherein at least one of the one or more nucleotides is polymerized onto the ends of the primers using the modified polymerase or biologically active fragment thereof. In some embodiments, a modified polymerase or biologically active fragment thereof having improved accuracy relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 80% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, the method further comprises determining the identity of one or more nucleotides polymerized by the modified polymerase. In some embodiments, the method further comprises determining the number of nucleotides polymerized by the modified polymerase. In some embodiments, at least 50% of the one or more nucleotides polymerized by the modified polymerase are identified. In some embodiments, substantially all of the one or more nucleotides polymerized by the modified polymerase are identified. In some embodiments, the polymerization occurs in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution comprises 125mM to 200mM salt. In some embodiments, the polymerization occurs in the presence of an ionic strength solution of at least 120mM salt. In some embodiments, the high ionic strength solution comprises KCl and/or NaCl.
In some embodiments, a modified polymerase or biologically active fragment thereof having improved accuracy relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 90% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, a modified polymerase or biologically active fragment thereof having improved accuracy relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 95% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, a modified polymerase or biologically active fragment thereof having improved accuracy relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 98% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, a modified polymerase or biologically active fragment thereof having improved accuracy relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of at least 99% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, the modified polymerase or biologically active fragment thereof further comprises at least 25 contiguous amino acids of the polymerase DNA binding domain. In some embodiments, the modified polymerase or biologically active fragment thereof comprises at least 50 contiguous amino acid residues of the polymerase DNA binding domain. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues of the polymerase DNA binding domain. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues of the polymerase DNA binding domain, while also having at least 90% identity to the sequence: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues in the polymerase DNA binding domain having at least 95% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33. In some embodiments, the methods (and related kits, systems, devices, and compositions) include amplification conditions with high ionic strength solutions. In one embodiment, the high ionic strength solution is a solution having at least 120mM KCl. In some embodiments, the high ionic strength solution comprises a solution of 125mM to 200mM KCl.
In some embodiments, the invention generally relates to methods (and related kits, systems, devices, and compositions) for performing nucleotide polymerization reactions, the methods comprising or consisting of mixing a modified polymerase or biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, wherein the modified polymerase or biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase (such as SEQ ID NO: 1 or SEQ ID NO: 34); and polymerizing at least one of the one or more nucleotides using the modified polymerase or biologically active fragment thereof in the mixture. A high ionic strength solution refers to a reaction mixture for conducting nucleotide polymerization having at least 120 mM KCl. In some embodiments, the high ionic strength solution comprises a solution of 125 mM to 200 mM KCl.
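The KCl thresholds stated above can be expressed as a simple check. The following is a minimal sketch only; the function and constant names are hypothetical and are not part of the disclosure, and the values are taken from the text (at least 120 mM KCl; 125 mM to 200 mM KCl in some embodiments).

```python
# Hypothetical helper names; threshold values are taken from the text.
HIGH_IONIC_MIN_MM = 120.0          # "at least 120 mM KCl"
EMBODIMENT_RANGE_MM = (125.0, 200.0)  # "125 mM to 200 mM KCl"


def is_high_ionic_strength(kcl_mm: float) -> bool:
    """Return True if the KCl concentration (mM) is at least 120 mM."""
    if kcl_mm < 0:
        raise ValueError("concentration cannot be negative")
    return kcl_mm >= HIGH_IONIC_MIN_MM


def in_embodiment_range(kcl_mm: float) -> bool:
    """Return True if the KCl concentration (mM) lies within 125-200 mM."""
    low, high = EMBODIMENT_RANGE_MM
    return low <= kcl_mm <= high


if __name__ == "__main__":
    for conc in (100.0, 120.0, 125.0, 200.0):
        print(conc, is_high_ionic_strength(conc), in_embodiment_range(conc))
```

Under these definitions a 120 mM KCl mixture counts as high ionic strength but falls outside the narrower 125 mM to 200 mM range.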
In some embodiments, the methods (and related kits, devices, systems, and compositions) comprise a modified polymerase or biologically active fragment thereof comprising or consisting of at least 80% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 90% identity to the sequence: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 95% identity to the sequence: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 98% identity to the sequence: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
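The identity thresholds recited above (80%, 90%, 95%, 98%) can be illustrated with a small calculation. This is a sketch under stated assumptions: the text does not specify an alignment algorithm or how gaps are counted, so the example assumes the two sequences are already aligned to equal length (gaps written as "-") and counts identical residues over all alignment columns other than gap-gap columns. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumption (not from the source): identity is the number of identical
    residue pairs divided by the number of alignment columns, ignoring
    columns where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" and res_b == "-":
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """Return True if percent identity is at least `threshold` (e.g. 95.0)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only; not taken from any SEQ ID NO.
    print(percent_identity("MKVLAAGE", "MKVLGAGE"))
    print(meets_identity_threshold("MKVLAAGE", "MKVLGAGE", 80.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the chosen convention should be stated alongside any reported value.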
In some embodiments, the present invention relates generally to methods (and related kits, systems, devices, and compositions) for detecting nucleotide incorporation, the methods comprising or consisting of performing a nucleotide incorporation reaction using a modified polymerase or a biologically active fragment thereof having at least 90% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; generating the nucleotide incorporation; and detecting the nucleotide incorporation. Detection of nucleotide incorporation can be via any suitable means, such as PAGE, fluorescence, dPCR quantification, nucleotide byproduct generation (e.g., hydrogen ion or pyrophosphate detection; suitable nucleotide byproduct detection systems include, but are not limited to, next generation sequencing platforms such as RainDance, Roche 454, and Ion Torrent systems), or nucleotide extension product detection (e.g., optical detection of extension products or detection of labeled nucleotide extension products).
In some embodiments, the methods for detecting nucleotide incorporation (and related kits, systems, devices, and compositions) include or consist of detecting nucleotide incorporation using a modified polymerase or biologically active fragment thereof that includes at least 95% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method of detecting nucleotide incorporation comprises or consists of detecting nucleotide incorporation using a modified polymerase or biologically active fragment thereof comprising at least 98% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method of detecting nucleotide incorporation comprises or consists of detecting nucleotide incorporation by a modified polymerase or biologically active fragment thereof comprising at least 99% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the method further comprises determining the identity of one or more nucleotides in the nucleotide incorporation. In some embodiments, the byproduct of nucleotide incorporation is a hydrogen ion. In some embodiments, the byproduct of nucleotide incorporation is pyrophosphate. In some embodiments, the byproduct of nucleotide incorporation is a labeled nucleotide extension product. In some embodiments, the method of detecting nucleotide incorporation comprises generating nucleotide incorporation under emulsion PCR or bridge PCR conditions.
In some embodiments, the present invention relates generally to methods (and related kits, systems, devices, and compositions) for detecting changes in ion concentration during a nucleotide polymerization reaction comprising or consisting of performing a first nucleotide polymerization reaction on a nucleic acid template or nucleic acid library in the presence of one or more nucleotides to be incorporated during the first nucleotide polymerization reaction, wherein the first nucleotide polymerization reaction comprises a modified polymerase or biologically active fragment thereof having at least 80% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and performing a second nucleotide polymerization reaction, wherein the second nucleotide polymerization reaction detects changes in the concentration of at least one type of ion during a second nucleotide polymerization reaction time course and provides a signal indicative of changes in the concentration of at least one type of ion. In some embodiments, the ions are hydrogen ions. In some embodiments, the ion is a pyrophosphate ion. In some embodiments, the signal indicative of the change in ion concentration is a relative increase in hydrogen ion production in the polymerization reaction. In some embodiments, the detection of at least one type of ion concentration change is monitored using an ISFET.
In some embodiments, the modified polymerase or biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 150 contiguous amino acid residues of a polymerase having at least 90% identity to the sequence: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 200 contiguous amino acid residues of a polymerase having at least 95% identity to the sequence: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 250 contiguous amino acid residues of a polymerase having at least 98% identity to the sequence: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment from the first nucleotide polymerization reaction comprises or consists of a polymerase having at least 99% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the present invention relates generally to methods (and related kits, systems, devices, and compositions) for amplifying nucleic acids comprising or consisting of contacting a nucleic acid with a polymerase or biologically active fragment thereof comprising at least 80% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the amplification is performed using polymerase chain reaction, emulsion polymerase chain reaction, isothermal amplification reaction, recombinase polymerase amplification reaction, proximity ligation amplification, rolling circle amplification, or strand displacement amplification. In some embodiments, amplifying comprises clonally amplifying the nucleic acid in solution. In some embodiments, the amplifying comprises clonally amplifying the nucleic acids on a solid support, such as a nucleic acid bead, a flow cell, a nucleic acid array, or a well present on a surface of the solid support. In some embodiments, amplification is performed using a polymerase or biologically active fragment comprising a thermostable DNA polymerase. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase with improved thermostability as compared to a reference polymerase, such as SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase with improved accuracy compared to a reference polymerase, such as SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, the methods for amplifying nucleic acids (and related kits, systems, devices, and compositions) comprise contacting a nucleic acid with a polymerase or biologically active fragment thereof comprising at least 90% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase having an improved average read length compared to the average read length obtained, under the same amplification conditions, using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, a method for amplifying a nucleic acid comprises contacting the nucleic acid with a polymerase or biologically active fragment thereof comprising at least 95% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the methods include polymerases or biologically active fragments having an improved average read length compared to an average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, a method for amplifying a nucleic acid comprises contacting the nucleic acid with a polymerase or biologically active fragment thereof comprising at least 98% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the methods include polymerases or biologically active fragments having an improved average read length compared to an average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, a method for amplifying a nucleic acid comprises contacting the nucleic acid with a polymerase or biologically active fragment thereof comprising at least 99% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the methods include polymerases or biologically active fragments having an improved average read length compared to an average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, the average read length is determined by analyzing, across all reads, the read lengths of amplified nucleic acid obtained using one or more of the modified polymerases provided herein to establish an average read length, and comparing that average read length to the average read length obtained using a reference polymerase.
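The comparison of average read lengths described above can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical, the read lengths are made-up values rather than data from the disclosure, and the arithmetic mean is assumed as the "average".

```python
from statistics import mean
from typing import Sequence


def average_read_length(read_lengths: Sequence[int]) -> float:
    """Arithmetic mean of read lengths (in bases) across all reads."""
    if not read_lengths:
        raise ValueError("at least one read is required")
    return float(mean(read_lengths))


def read_length_difference(modified: Sequence[int], reference: Sequence[int]) -> float:
    """Mean read length with the modified polymerase minus that of the reference.

    A positive value indicates a longer average read length for the
    modified polymerase under the compared conditions.
    """
    return average_read_length(modified) - average_read_length(reference)


if __name__ == "__main__":
    # Hypothetical read lengths for illustration; not experimental results.
    modified_reads = [200, 220, 240]
    reference_reads = [180, 200, 190]
    print(average_read_length(modified_reads))
    print(average_read_length(reference_reads))
    print(read_length_difference(modified_reads, reference_reads))
```

For the comparison to be meaningful, both sets of reads would need to come from runs performed under the same amplification and sequencing conditions.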
In some embodiments, the present invention relates generally to a method for amplifying a nucleic acid, comprising or consisting of contacting a nucleic acid with a polymerase or biologically active fragment thereof comprising at least 80% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, amplification is performed by a polymerase or biologically active fragment with improved templating efficiency compared to a reference sample, such as SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the method for amplifying nucleic acid comprises amplifying nucleic acid under emulsion PCR conditions. In some embodiments, the method for amplifying a nucleic acid comprises amplifying a nucleic acid under bridge PCR conditions. In some embodiments, the bridge PCR conditions comprise hybridizing one or more of the amplified nucleic acids to a solid support. In some embodiments, the hybridized one or more amplified nucleic acids can be used as a template for further amplification. In some embodiments, the modified polymerase or biologically active fragment thereof comprises a polymerase derived from Thermus aquaticus (Taq) DNA polymerase. SEQ ID NO: 1 is the full-length wild-type nucleic acid sequence of Thermus aquaticus (Taq) DNA polymerase. In some embodiments, Taq DNA polymerase may be used as a reference polymerase in the methods, kits, devices, systems, and compositions described herein.
In some embodiments, the present invention relates generally to methods (and related kits, systems, devices, and compositions) for synthesizing nucleic acids comprising or consisting of incorporating at least one nucleotide onto an end of a primer using a modified polymerase or biologically active fragment thereof having at least 90% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. Optionally, the method further comprises detecting incorporation of at least one nucleotide onto the end of the primer. In some embodiments, the method further comprises determining the identity of at least one of the at least one nucleotide incorporated onto the end of the primer. In some embodiments, the method can include determining the identity of all nucleotides incorporated onto the end of the primer. In some embodiments, the method comprises synthesizing the nucleic acid in a template-dependent manner. In some embodiments, the method may comprise synthesizing the nucleic acid in solution, on a solid support, or in an emulsion (such as emPCR).
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 80% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 95% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 97% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 98% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 99% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
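The substitution notation used in the lists above (for example, L763F: wild-type residue, 1-based position, replacement residue) can be parsed and applied programmatically. The following is a minimal sketch with hypothetical function names; it uses a short toy sequence because the actual SEQ ID NO: 1 sequence is not reproduced here, and it checks that the residue at the stated position matches the stated wild-type residue before substituting.

```python
import re
from typing import Iterable, Tuple

# One-letter codes for the 20 standard amino acids.
_AA = "ACDEFGHIKLMNPQRSTVWY"
_SUB_PATTERN = re.compile(rf"^([{_AA}])(\d+)([{_AA}])$")


def parse_substitution(code: str) -> Tuple[str, int, str]:
    """Split a code such as 'L763F' into (wild_type, position, replacement)."""
    match = _SUB_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a valid substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence: str, codes: Iterable[str]) -> str:
    """Apply substitutions to a reference sequence using 1-based positions."""
    residues = list(sequence.upper())
    for code in codes:
        wild_type, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{code}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    print(parse_substitution("L763F"))
    # Toy reference sequence for illustration only; not SEQ ID NO: 1.
    print(apply_substitutions("MLEK", ["L2F", "E3V"]))
```

Because the wild-type residue is verified at each step, supplying two alternatives for the same position (such as E295F and E295N) in one call raises an error, which reflects that these are listed as alternatives rather than as a combination.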
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of SEQ ID NO: 2.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 80% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 95% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 97% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 98% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 99% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of SEQ ID NO: 3.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 80% identity to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 95% identity to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 98% identity to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of SEQ ID NO: 4.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 80% identity to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 95% identity to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of at least 98% identity to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of any one of SEQ ID NOs: 5 to 33.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 5, and having one or more amino acid mutations selected from the group consisting of: A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 6, and having one or more amino acid mutations selected from the group consisting of: P6N, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 7, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 8, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 9, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 10, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 11, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 12, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 13, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 14, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 15, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 16, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 17, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 18, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 19, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 20, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 21, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 22, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 23, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 24, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 25, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 26, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 27, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 28, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 29, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 30, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 31, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E805I and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 32, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C and L828A.
In some embodiments, the present invention relates generally to an isolated and purified polypeptide comprising or consisting of an amino acid sequence having at least 90% identity to SEQ ID NO: 33, and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C and E805I.
In some embodiments, the present invention relates generally to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 80% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the present invention relates generally to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 90% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the present invention relates generally to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 95% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the present invention relates generally to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 98% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the present invention relates generally to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 99% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the invention relates generally to a composition comprising an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 90% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the invention relates generally to a composition comprising an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 90% identity to any one of SEQ ID NOs: 2 to 33, and further comprising one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
In some embodiments, the present invention relates generally to a vector comprising an isolated nucleic acid sequence encoding a polypeptide or biologically active fragment thereof selected from the group consisting of SEQ ID NOs: 2 to 33. In some embodiments, the polypeptide or biologically active fragment thereof encoded by the isolated nucleic acid sequence of the vector comprises a DNA polymerase. In some embodiments, the DNA polymerase is Thermus aquaticus (Taq) polymerase. In some embodiments, the DNA polymerase is a thermostable DNA polymerase. In some embodiments, the DNA polymerase is derived from a thermostable Thermus aquaticus (Taq) polymerase.
In some embodiments, the present invention relates generally to a composition comprising an isolated polypeptide having at least 80% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the present invention relates generally to a composition comprising an isolated polypeptide having at least 90% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the present invention relates generally to a composition comprising an isolated polypeptide having at least 95% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the present invention relates generally to a composition comprising an isolated polypeptide having at least 98% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the present invention relates generally to a composition comprising an isolated polypeptide having at least 99% identity to any one of SEQ ID NOs: 2 to 33.
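By way of non-limiting illustration only, a percent identity threshold of the kind recited above (e.g., at least 80%, 90%, 95%, 98% or 99%) could be checked computationally along the following lines. The sketch assumes that the candidate and reference sequences have already been aligned by a separate tool, so that both strings have equal length and use "-" for gaps. The function names `percent_identity` and `meets_identity_threshold`, the convention of counting gapped columns as mismatches, and the toy sequences are hypothetical and are not taken from the present disclosure; other identity conventions may give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: columns where both sequences carry a gap are ignored, and
    columns where only one sequence carries a gap count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in alignment")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair has at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy sequences for illustration only; they are not any SEQ ID NO of the disclosure.
identical = percent_identity("MKLV", "MKLV")
one_mismatch = percent_identity("MKLV", "MKLA")
one_gap = percent_identity("MK-V", "MKLV")
```

In this sketch a single substitution or a single gapped column in a four-residue toy alignment lowers the identity to 75%, which would satisfy neither an 80% nor a 90% threshold.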
In some embodiments, the present invention relates generally to a composition comprising an isolated nucleic acid having at least 80% identity to SEQ ID NO: 1, and further comprising at least one amino acid substitution at a position selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, and L828, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a composition comprising an isolated nucleic acid having at least 80% identity to SEQ ID NO: 1, and further comprising at least one amino acid substitution selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the composition comprises a sequence having at least 80% identity to any one of SEQ ID NOs: 2 to 33, and further comprising at least one amino acid substitution at a position selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, and L828, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the composition comprises a sequence having at least 80% identity to any one of SEQ ID NOs: 2 to 33, and further comprising at least one amino acid substitution selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the composition comprises or consists of any one of SEQ ID NOs: 2 to 33.
In some embodiments, the composition comprises a sequence having at least 85%, 90%, 95%, 98%, or 99% identity to any one of SEQ ID NOs: 2 to 33, and further comprising at least one amino acid substitution selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to SEQ ID NO: 1.
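By way of non-limiting illustration only, the substitution notation used above (e.g., L763F, denoting replacement of leucine at position 763 with phenylalanine, with numbering relative to a reference sequence) could be parsed and applied to a reference sequence as sketched below. The helper names `parse_substitution` and `apply_substitutions` and the five-residue toy sequence are hypothetical; the sketch assumes 1-based numbering and rejects any substitution whose stated wild-type residue does not match the reference at that position.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code: str) -> tuple[str, int, str]:
    """Split a code such as 'L763F' into ('L', 763, 'F')."""
    match = _SUBSTITUTION.match(code.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(reference: str, codes: list[str]) -> str:
    """Return a copy of `reference` with each substitution applied.

    Positions are interpreted relative to `reference` (1-based).
    """
    residues = list(reference)
    for code in codes:
        wild_type, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside reference sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{code}: reference has {residues[position - 1]} at position {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy reference for illustration only; it is not any SEQ ID NO of the disclosure.
toy_reference = "MPLEA"
toy_variant = apply_substitutions(toy_reference, ["P2N", "E4V"])
```

Checking the wild-type residue before substituting is a design choice that guards against applying a code numbered against one reference (for example SEQ ID NO: 1) to a different sequence with shifted positions.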
In some embodiments, the present invention relates generally to a kit comprising isolated polypeptides having at least 80% identity to any one of SEQ ID NOs: 2 to 33. In some embodiments, the kit comprises isolated polypeptides having at least 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the kit comprises an isolated polypeptide selected from the group consisting of SEQ ID NOs: 2 to 33.
In some embodiments, the kit comprises an isolated polypeptide comprising or consisting of at least 250 contiguous amino acid residues having at least 90% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the kit comprises an isolated polypeptide comprising or consisting of at least 450 contiguous amino acid residues having at least 95% identity to any one of SEQ ID NOs: 2 to 33.
In some embodiments, the kit comprises an isolated polypeptide comprising or consisting of at least 650 contiguous amino acid residues having at least 98% identity to any one of SEQ ID NOs: 2 to 33. In some embodiments, the kit further comprises dNTPs, one or more buffers and/or MgCl2.
In some embodiments, the invention generally relates to a polymerase or biologically active fragment thereof having DNA polymerase activity and at least 80% identity to any one of SEQ ID NOs: 2 to 33, wherein the polymerase or biologically active fragment having DNA polymerase activity comprises at least one amino acid substitution as compared to SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, at least one amino acid substitution as compared to SEQ ID NO: 1 or SEQ ID NO: 34 may confer a beneficial property to the polymerase or biologically active fragment thereof. In some embodiments, the beneficial properties conferred to the polymerase or biologically active fragment thereof (as compared to SEQ ID NO: 1 or SEQ ID NO: 34) include improved thermostability, improved read length, improved templating efficiency, improved performance in high ionic strength solutions, or improved accuracy. In some embodiments, the beneficial properties conferred to the polymerase or biologically active fragment thereof (as compared to SEQ ID NO: 1 or SEQ ID NO: 34) include reduced strand bias for GC-rich and AT-rich nucleic acids. It is generally understood that the beneficial properties imparted to a polymerase or biologically active fragment thereof (as compared to the properties of SEQ ID NO: 1 or SEQ ID NO: 34) can be determined by assessing and/or measuring such properties under the same conditions (e.g., comparing the properties of SEQ ID NO: 1 to those of the polymerase or biologically active fragment thereof under the same conditions). For example, the accuracy of a DNA polymerase can be measured with respect to the longest perfect read obtained from a nucleotide polymerization reaction (typically measured as the number of nucleotides correctly included in the read). In some embodiments, the nucleotide polymerization reaction may be performed using emulsion PCR, bridge PCR, or hot start PCR conditions. In some embodiments, one or more of the beneficial properties conferred to the polymerase or biologically active fragment thereof can be determined by assessing sequencing accuracy. In some embodiments, sequencing accuracy can be determined using any next generation sequencing platform (e.g., an Ion Torrent system, or an Illumina HiSeq, TruSeq, or X-10 system). In some embodiments, sequencing accuracy can be determined using any ISFET-based sequencing system.
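As a non-limiting illustration of the "longest perfect read" measure mentioned above, the following Python sketch counts the longest run of consecutive correctly called nucleotides in a read. The function name `longest_perfect_run` is hypothetical, and the sketch assumes the read and reference are already placed position by position with no insertions or deletions; a real sequencing pipeline would first align the read to the reference.

```python
def longest_perfect_run(read: str, reference: str) -> int:
    """Length of the longest stretch of consecutive matching bases.

    Assumption: read[i] corresponds to reference[i] (no indels); positions
    beyond the shorter of the two sequences are ignored.
    """
    best = 0
    current = 0
    for called, expected in zip(read.upper(), reference.upper()):
        if called == expected:
            current += 1
            if current > best:
                best = current
        else:
            current = 0  # an error ends the current perfect stretch
    return best


# Toy example: one miscalled base at the fifth position.
print(longest_perfect_run("ACGTACGT", "ACGTTCGT"))
```

Under the same-conditions comparison described above, such a value would be computed identically for reads produced with the modified polymerase and with the reference polymerase.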
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33, and further comprising at least one amino acid substitution at a position selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, and L828, with numbering relative to SEQ ID NO: 34.
In some embodiments, the invention generally relates to a substantially purified polymerase comprising or consisting of an amino acid sequence that is a fragment, retaining polymerase activity, of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33. In some embodiments, the polymerase activity (also referred to herein as a polymerase property or polymerase characteristic) is selected from primer extension activity, strand displacement activity, proofreading activity, nick-initiated polymerase activity, reverse transcriptase activity, accuracy, average read length, thermostability, processivity, strand bias, or nucleotide polymerization activity. In some embodiments, polymerase activity is selected from one or more sequencing-based metrics selected from raw read accuracy, average read length, thermostability, or processivity.
In some embodiments, the invention generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of a biologically active fragment, having polymerase activity, of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33, the polymerase activity being selected from improved read length, improved accuracy, or improved thermal stability as compared to the polymerase activity of SEQ ID NO: 1 or SEQ ID NO: 34 under the same conditions. In some embodiments, polymerase activity is determined in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution is at least 120 mM KCl. In some embodiments, the high ionic strength solution is 125 mM KCl to 200 mM KCl.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 80% identical to SEQ ID NO: 1, and further comprises at least one amino acid substitution at a position selected from the group consisting of: A97, K240, L287, and K292, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 80% identical to SEQ ID NO: 1, and further comprises at least one amino acid substitution selected from the group consisting of: A97V, K240I, L287T, and K292C, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E397 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E397V amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an L763 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an L763F amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E805 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E805I amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E745 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E745T amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E397 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E397V amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an L763 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an L763F amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E805 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E805I amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E745 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E745T amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a composition comprising a recombinant polymerase or biologically active fragment thereof homologous to SEQ ID NO: 1 and having at least 90% identity to SEQ ID NO: 1, wherein the recombinant polymerase comprises a mutation or combination of mutations relative to SEQ ID NO: 1 selected from P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, or L828A. In some embodiments, a recombinant polymerase or biologically active fragment thereof homologous to SEQ ID NO: 1 includes a thermostable DNA polymerase from a species other than Thermus aquaticus (Taq).
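The substitution notation used above (for example, L763F denotes leucine at position 763 of the reference replaced by phenylalanine) can be read mechanically. The following Python sketch is offered only as an illustration under stated assumptions: the helper names `parse_substitution` and `apply_substitutions` are hypothetical, positions are taken as 1-based numbering relative to the supplied reference, and the short toy sequence stands in for a reference such as SEQ ID NO: 1, which is not reproduced here.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code: str) -> tuple[str, int, str]:
    """Split a code such as 'L763F' into ('L', 763, 'F')."""
    match = _SUBSTITUTION.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(reference: str, codes: list[str]) -> str:
    """Return the reference with each substitution applied.

    Each code is checked against the unmodified reference so that a
    mismatch in numbering or wild-type residue is reported, not ignored.
    """
    residues = list(reference)
    for code in codes:
        wild_type, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(reference):
            raise ValueError(f"{code}: position outside the reference")
        if reference[position - 1] != wild_type:
            raise ValueError(f"{code}: reference has {reference[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy five-residue reference (hypothetical, for illustration only).
print(apply_substitutions("MPLEK", ["L3F", "E4V"]))
```

Checking the wild-type residue before substituting is a design choice that guards against applying a code with numbering taken from the wrong reference sequence.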
In some embodiments, the present invention relates generally to a composition comprising a recombinant polymerase or biologically active fragment thereof homologous to SEQ ID NO: 34 and having at least 90% identity to SEQ ID NO: 34, wherein the recombinant polymerase comprises a mutation or combination of mutations relative to SEQ ID NO: 34 selected from P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, or L828A. In some embodiments, a recombinant polymerase or biologically active fragment thereof homologous to SEQ ID NO: 34 includes a thermostable DNA polymerase from a species other than Thermus aquaticus (Taq). In some embodiments, the recombinant polymerase homologous to SEQ ID NO: 1 or SEQ ID NO: 34 comprises a thermostable polymerase selected from the group consisting of: Klentaq-235 DNA polymerase, Klentaq-278 DNA polymerase, Stoffel fragment, Klentaq-291 DNA polymerase, Pyrococcus furiosus DNA polymerase, Pyrococcus GB-D DNA polymerase, Thermus flavus DNA polymerase, Thermus thermophilus DNA polymerase, Thermococcus litoralis DNA polymerase, and combinations thereof.
In some embodiments, the present invention relates generally to a composition comprising a recombinant polymerase or biologically active fragment thereof homologous to SEQ ID NO: 34 and having at least 80% identity to SEQ ID NO: 34 or a biologically active fragment thereof, wherein the recombinant polymerase comprises an E397 mutation. In some embodiments, a recombinant polymerase homologous to SEQ ID NO: 34 comprises a mutation that increases processivity, increases accuracy, increases average read length, or improves thermostability as compared to a reference polymerase lacking the corresponding mutation. In some embodiments, increased processivity, increased accuracy, increased average read length, or improved thermostability is measured using an ISFET. In some embodiments, the ISFET is coupled to a semiconductor-based sequencing platform. In some embodiments, the semiconductor-based sequencing platform is a Personal Genome Machine or a Proton sequencer (Life Technologies, California).
In some embodiments, the present invention relates generally to a composition comprising a recombinant polymerase or biologically active fragment thereof homologous to SEQ ID NO: 34 and having at least 80% identity to SEQ ID NO: 34 or a biologically active fragment thereof, wherein the recombinant polymerase comprises an E397V mutation relative to SEQ ID NO: 34, and wherein the polymerase further comprises one or more of the following mutations: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A, wherein the numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a composition comprising a recombinant polymerase or biologically active fragment thereof homologous to SEQ ID NO: 34 and having at least 80% identity to SEQ ID NO: 34 or a biologically active fragment thereof, wherein the recombinant polymerase comprises a mutation or combination of mutations relative to SEQ ID NO: 34 selected from the group consisting of E397V, L763F, E805I, and E745T, wherein the numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a composition comprising a recombinant polymerase or biologically active fragment thereof homologous to SEQ ID NO: 34 and having at least 80% identity to SEQ ID NO: 34 or a biologically active fragment thereof, wherein the recombinant polymerase comprises a mutation or combination of mutations relative to SEQ ID NO: 34 selected from the group consisting of E397V, L763F, E805I, and E745T, and wherein the polymerase further comprises one or more of the following mutations: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, E713W, V737A, E790G, E794C, and L828A, wherein the numbering is relative to SEQ ID NO: 34.
In some embodiments, a recombinant polymerase or biologically active fragment thereof homologous to SEQ ID NO: 1 or SEQ ID NO: 34 exhibits increased accuracy as compared to a reference polymerase lacking a mutation or combination of mutations relative to SEQ ID NO: 1 or SEQ ID NO: 34; or exhibits an increased read length as compared to a reference polymerase lacking a mutation or combination of mutations relative to SEQ ID NO: 1 or SEQ ID NO: 34; or exhibits an increased total sequencing throughput as compared to a reference polymerase lacking a mutation or combination of mutations relative to a recombinant polymerase homologous to SEQ ID NO: 1 or SEQ ID NO: 34; or exhibits reduced strand bias as compared to a reference polymerase lacking a mutation or combination of mutations relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, increased accuracy, increased read length, increased sequencing throughput, or reduced strand bias is measured using an ISFET. In some embodiments, the ISFET is coupled to a semiconductor-based sequencing platform. In some embodiments, the semiconductor-based sequencing platform is a Personal Genome Machine or Proton sequencer available from Life Technologies Corporation (CA).
In some embodiments, the present invention relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, wherein the polymerase or biologically active fragment thereof improves sequencing coverage of a GC-rich genome, wherein the GC-rich genome has a GC content of at least 60%, 65%, 70%, 75%, 80%, 85% or more. In some embodiments, the GC-rich genome is derived or obtained from a GC-rich organism, e.g., a bacterial genome, such as that of Rhodococcus or the like. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of the GC-rich genome such that, after nucleic acid sequencing, the data includes less than 100 nucleic acid gaps per gigabyte of nucleic acid sequencing data. In some embodiments, the polymerase or biologically active fragment thereof has at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34 and further includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 are at positions selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, or L828, with numbering relative to SEQ ID NO: 1. It will be apparent to one of ordinary skill in the art that any suitable method of determining GC content is deemed sufficient. For example, GC content can be measured by spectrophotometric determination of the melting temperature of the DNA double helix. When double-stranded DNA is separated to form two single strands, the absorption of DNA at 260 nm increases significantly. Other suitable methods of determining GC content include using a GC calculator to calculate the expected melting temperature or using flow cytometry to determine the GC ratio over a large number of samples.
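Where the nucleotide sequence itself is available, GC content can also be counted directly. The following Python sketch is illustrative only: the function names `gc_percent` and `is_gc_rich` are hypothetical, ambiguous bases such as N are assumed to be excluded from the count, and the 60% default simply mirrors the lowest threshold recited above.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / (gc + at)


def is_gc_rich(sequence: str, threshold_percent: float = 60.0) -> bool:
    """True when GC content meets the chosen threshold (assumed 60% by default)."""
    return gc_percent(sequence) >= threshold_percent


# Toy fragment (not genomic data): four of six bases are G or C.
print(round(gc_percent("GGCCAT"), 1), is_gc_rich("GGCCAT"))
```

The same counting approach, with A and T in place of G and C, would apply to the AT-rich thresholds described elsewhere in this specification.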
In some embodiments, the present invention relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, wherein the polymerase or biologically active fragment thereof improves sequencing coverage of a GC-rich genome, wherein the GC-rich genome has a GC content of at least 60%, 65%, 70%, 75%, 80%, 85% or more. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of the GC-rich genome such that, after nucleic acid sequencing, the data comprises less than 50 nucleic acid gaps per gigabyte of nucleic acid sequencing data. In some embodiments, the polymerase or biologically active fragment thereof has at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34 and further includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 are at positions selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, or L828, with numbering relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, wherein the polymerase or biologically active fragment thereof improves sequencing coverage of a GC-rich genome, wherein the GC-rich genome has a GC content of at least 60%, 65%, 70%, 75%, 80%, 85% or more. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of the GC-rich genome such that, after nucleic acid sequencing, the data comprises less than 20 nucleic acid gaps per gigabyte of nucleic acid sequencing data. In some embodiments, the polymerase or biologically active fragment thereof has at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34 and further includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 are at positions selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, or L828, with numbering relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, wherein the polymerase or biologically active fragment thereof improves sequencing coverage of an AT-rich genome, wherein the AT-rich genome has an AT content of at least 60%, 65%, 70%, 75%, 80% or more. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of the AT-rich genome such that, after nucleic acid sequencing, the data comprises less than 100 nucleic acid gaps per gigabyte of nucleic acid sequencing data. In some embodiments, the polymerase or biologically active fragment thereof has at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34 and further includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 are at positions selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, or L828, with numbering relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, wherein the polymerase or biologically active fragment thereof improves sequencing coverage of an AT-rich genome, wherein the AT-rich genome has an AT content of at least 60%, 70%, 75% or 80%. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of the AT-rich genome such that, after nucleic acid sequencing, the data comprises less than 50 nucleic acid gaps per gigabyte of nucleic acid sequencing data.
In some embodiments, the present invention relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, wherein the polymerase or biologically active fragment thereof improves sequencing coverage of an AT-rich genome, wherein the AT-rich genome has an AT content of at least 60%, 70%, 75% or 80%. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of the AT-rich genome such that, after nucleic acid sequencing, the data comprises less than 20 nucleic acid gaps per gigabyte of nucleic acid sequencing data.
In some embodiments, the present invention relates generally to a method for performing nucleic acid amplification comprising or consisting of contacting a modified polymerase with a nucleic acid template in the presence of one or more nucleotides, wherein the modified polymerase includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 and has increased accuracy relative to SEQ ID NO: 1 or SEQ ID NO: 34; and polymerizing at least one of the one or more nucleotides using the modified polymerase. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33.
In some embodiments, the present invention generally relates to a method for obtaining sequence information from a nucleic acid template comprising providing a reaction mixture comprising the nucleic acid template hybridized to a sequencing primer and bound to a modified polymerase; contacting the template nucleic acid with at least one type of nucleotide triphosphate, wherein said contacting comprises incorporating one or more nucleotides from the at least one type of nucleotide onto the 3' end of the sequencing primer and generating an extended primer product; detecting the presence of the extended primer product in the reaction mixture, thereby determining whether nucleotide incorporation has occurred; and identifying at least one of the one or more nucleotides incorporated from the at least one type of nucleotide. In some embodiments, the method comprises a modified polymerase comprising an isolated polypeptide having at least 80% identity to SEQ ID NO: 1 and/or SEQ ID NO: 34, wherein the modified polymerase comprises one or more amino acid substitutions at positions selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, or L828, with numbering relative to SEQ ID NO: 1. In some embodiments, the method can include a modified polymerase comprising an isolated polypeptide having at least 80% identity to SEQ ID NO: 1 and/or SEQ ID NO: 34, wherein the modified polymerase comprises one or more amino acid substitutions selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, or L828A, wherein the numbering is relative to SEQ ID NO: 1.
In some embodiments, the method may comprise repeating the contacting, detecting, and identifying steps more than once, thereby identifying a plurality of sequential nucleotide incorporations. In some embodiments, the method may comprise incorporating one or more reversible terminators and/or nucleotide analogs. In some embodiments, the method can include incorporating at least one dNTP (such as dATP, dTTP, dGTP, or dCTP).
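The repeated contacting, detecting, and identifying steps can be pictured with a toy simulation. The Python sketch below is a simplified illustration under several assumptions that are not drawn from the specification: one nucleotide type is supplied per cycle in a fixed, hypothetical flow order (`"TACG"`), incorporation is error-free, a homopolymer stretch is extended fully within one cycle, and the template is written in the order in which the polymerase reads it, starting immediately downstream of the primer. The function name `simulate_flows` is likewise hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flows(template: str, flow_order: str = "TACG", max_flows: int = 400) -> str:
    """Return the bases added to the primer over successive nucleotide flows.

    Each loop iteration models one cycle: contact the template with one
    nucleotide type, detect whether (and how many) incorporations occurred,
    and record the identity of the incorporated nucleotide.
    """
    template = template.upper()
    extended = []
    position = 0
    for flow in range(max_flows):
        if position >= len(template):
            break  # template fully copied
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # Extend while the supplied nucleotide pairs with the next template base.
        while position < len(template) and COMPLEMENT[template[position]] == nucleotide:
            position += 1
            incorporated += 1
        if incorporated:  # a detected signal identifies the base and its count
            extended.append(nucleotide * incorporated)
    return "".join(extended)


# Toy template: the extended primer product is its base-by-base complement.
print(simulate_flows("AAGC"))
```

In this toy model the polymerase is idealized; the accuracy and read-length properties discussed in this specification concern how closely a real polymerase approaches such behavior.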
Drawings
The accompanying drawings, which are incorporated in and form a part of the specification, illustrate one or more exemplary embodiments and serve to explain the principles of the various exemplary embodiments. The drawings are exemplary and explanatory only and are not to be construed as limiting or restricting in any way.
The graphs shown in FIGS. 1A-1E provide sequencing throughput and average read length data obtained using an exemplary modified polymerase according to the present invention.
The tables and graphs of FIGS. 2A1, 2A2, 2B1, and 2B2 provide exemplary nucleic acid sequencing data obtained using an exemplary modified polymerase according to the present invention.
The tables of FIGS. 3A-3B provide exemplary nucleic acid sequencing data obtained using an exemplary modified polymerase according to the present invention as compared to a reference polymerase (SEQ ID NO: 34).
The table of FIG. 4 provides exemplary nucleic acid sequencing data for GC content obtained using an exemplary modified polymerase according to the invention (SEQ ID NO: 2).
The tables and graphs of FIGS. 5A-5B provide exemplary nucleic acid sequencing data obtained using an exemplary modified polymerase according to the present invention, as compared to a reference polymerase (SEQ ID NO: 34).
The table of FIG. 6 provides exemplary nucleic acid sequencing data obtained using an exemplary modified polymerase according to the present invention as compared to a reference polymerase (SEQ ID NO: 34).
The graph shown in FIG. 7 provides exemplary thermal stability data obtained using an exemplary modified polymerase according to the present invention.
The graph shown in FIG. 8 provides exemplary thermal stability data obtained using an exemplary modified polymerase according to the present invention.
The graph shown in FIG. 9 provides exemplary thermal stability data obtained using an exemplary modified polymerase according to the present invention.
The graph shown in FIG. 10 provides exemplary thermal stability data obtained using an exemplary modified polymerase according to the present invention.
The graph shown in FIG. 11 provides exemplary thermal stability data obtained at 95 ℃ using an exemplary modified polymerase according to the present invention.
The graph shown in FIG. 12 provides exemplary thermal stability data obtained at 96 ℃ using an exemplary modified polymerase according to the present invention.
The graph shown in FIG. 13 provides exemplary thermal stability data obtained at 95 ℃ in the absence of trehalose using an exemplary modified polymerase according to the present invention.
FIG. 14 is a schematic overview of an exemplary thermostable activity assay performed in accordance with the present invention.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All patents, patent applications, published applications, articles and other publications mentioned herein (above and below) are incorporated by reference in their entirety. If a definition and/or description set forth herein, explicitly or implicitly is contrary to or otherwise inconsistent with any definition set forth in the patents, patent applications, published applications and other publications that are incorporated herein by reference, the definition and/or description set forth herein takes precedence over the definition incorporated by reference.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, and recombinant DNA technology, which are within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, J., and Russell, D.W., 2001, Molecular Cloning: A Laboratory Manual, third edition; Ausubel, F.M. et al., eds., 2002, Short Protocols in Molecular Biology, fifth edition.
It should be noted that not all of the activities described in the general description or the examples are required, that a portion of a particular activity may not be required, and that one or more other activities may be performed in addition to those described. Again, the order in which activities are listed is not necessarily the order in which the activities are performed.
In some cases, some concepts have been described with reference to specific embodiments. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the present invention as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present invention.
As used herein, the term "comprising" (and any form or variant of comprising, such as "comprise" and "comprises"), "having" (and any form or variant of having, such as "have" and "has"), "including" (and any form or variant of including, such as "includes" and "include") or "containing" (and any form or variant of containing, such as "contains" and "contain") are inclusive or open-ended and do not exclude additional, unrecited additives, components, integers, elements or process steps. For example, a process, method, article, or apparatus that comprises a list of features is not necessarily limited to only those features but may include other features not expressly listed or inherent to such process, method, article, or apparatus.
Unless expressly stated to the contrary, "or" means an inclusive or, rather than an exclusive or. For example, condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any feature(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature of any or all the claims.
After reading this specification, skilled artisans will appreciate that certain features that are, for clarity, described herein in the context of separate embodiments may also be provided in combination in a single embodiment. Conversely, various features that are, for brevity, described in the context of a single embodiment may also be provided separately or in any subcombination. Further, reference to a value stated in a range includes each value within that range.
In addition, articles such as "a," "an" or "the" are used to describe the elements and components described herein. This is done merely for convenience and to give a general sense of the scope of the invention. This description should be read to include one or at least one, and the singular also includes the plural unless it is obvious that it is meant otherwise. Thus, the terms "a" and "an" and "the" and similar referents used herein are to be construed to cover both the singular and the plural, unless the context in which they are used indicates otherwise. Accordingly, the use of the words "a" or "an" or "the" when used in the claims or the specification, including in connection with the term "comprising," may mean "one," but is also consistent with the meaning of "one or more," "at least one," and "one or more than one."
As used herein, the term "polymerase" and variations thereof encompasses any enzyme that can catalyze the polymerization of nucleotides (including analogs thereof) into a nucleic acid strand. Typically, but not necessarily, such nucleotide polymerization may occur in a template-dependent manner. Such polymerases can include, but are not limited to, naturally occurring polymerases and any subunits and truncations thereof, mutant polymerases, variant polymerases, recombinant, fused, or otherwise engineered polymerases, chemically modified polymerases, synthetic molecules or components, and any analogs, homologs, derivatives, or fragments thereof that retain the ability to catalyze such polymerization. Optionally, the polymerase may be a mutant polymerase comprising one or more mutations involving the substitution of one or more amino acids with other amino acids, the insertion of one or more amino acids into, or the deletion of one or more amino acids from, the polymerase, or the joining of two or more portions of the polymerase, including joining two or more portions from different polymerase species or families. Typically, polymerases contain one or more active sites where nucleotide binding and/or catalysis of nucleotide polymerization can occur. Some exemplary polymerases include, but are not limited to, DNA polymerases (such as Phi-29 DNA polymerase, Taq polymerase, reverse transcriptase, and E. coli DNA polymerase) and RNA polymerases. As used herein, the term "polymerase" and variations thereof also refer to fusion proteins comprising at least two moieties linked to each other, wherein a first moiety comprises a polypeptide that can catalyze the polymerization of nucleotides into a nucleic acid strand and said first moiety is linked to a second moiety comprising a second polypeptide. In some embodiments, the second polypeptide may include a reporter enzyme or a domain that enhances processivity.
As used herein, the terms "link", "linked", and variations thereof encompass any type of fusion, linkage, adhesion, or association that is sufficiently stable to withstand use in the particular biological application in question. Such linkages may include, for example, covalent, ionic, hydrogen, dipole-dipole, hydrophilic, hydrophobic or affinity bonding, bonds or associations involving van der Waals forces, mechanical adhesion, and the like. Optionally, such linkage may occur between combinations of different molecules, including (but not limited to): between the nanoparticle and the protein; between the protein and the label; between the linker and the functionalized nanoparticle; between the linker and the protein; between nucleotides and labels, etc. Some examples of linkages may be found, for example, in Hermanson, G., Bioconjugate Techniques, second edition (2008); and Aslam, M., Dent, A., Bioconjugation: Protein Coupling Techniques for the Biomedical Sciences, London: Macmillan (1998).
The term "modification" or "modified" and variations thereof as used herein with reference to a polypeptide or protein (e.g., polymerase) encompass any change in the structural, biological, and/or chemical properties of the protein. In some embodiments, the modification may comprise an amino acid sequence change of the protein. For example, the modification may optionally include one or more amino acid mutations, including (but not limited to) amino acid additions, deletions, and substitutions (including both conservative and non-conservative substitutions).
The term "conservative" and variations thereof as used herein with reference to any change in the amino acid sequence refers to an amino acid mutation in which one or more amino acids are substituted with another amino acid having highly similar properties. For example, one or more amino acids comprising a non-polar or aliphatic side chain (e.g., glycine, alanine, valine, leucine, or isoleucine) may be substituted for each other. Similarly, one or more amino acids comprising polar uncharged side chains (e.g., serine, threonine, cysteine, methionine, asparagine, or glutamine) can be substituted for each other. Similarly, one or more amino acids comprising an aromatic side chain (e.g., phenylalanine, tyrosine, or tryptophan) may be substituted for each other. Similarly, one or more amino acids comprising a positively charged side chain (e.g., lysine, arginine, or histidine) may be substituted for each other. Similarly, one or more amino acids comprising negatively charged side chains (e.g., aspartic acid or glutamic acid) can be substituted for each other. In some embodiments, the modified polymerase or biologically active fragment thereof is a variant comprising one or more of these conservative amino acid substitutions, or any combination thereof. In some embodiments, conservative substitutions for leucine include: alanine, isoleucine, valine, phenylalanine, tryptophan, methionine, and cysteine. In other embodiments, conservative substitutions for asparagine include: arginine, lysine, aspartate, glutamate, and glutamine.
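By way of non-limiting illustration, the side-chain groupings listed above can be expressed as a simple lookup. The following Python sketch is an assumption-laden example rather than a definitive classifier: the names `SIDE_CHAIN_CLASSES`, `side_chain_class` and `is_conservative` are hypothetical, the groupings are copied from this paragraph as written (including its placement of cysteine and methionine), and the embodiment-specific lists for leucine and asparagine are deliberately not modeled.

```python
# Side-chain classes as listed in the paragraph above (one-letter codes).
# Residues the paragraph does not list (e.g., proline) belong to no class here.
SIDE_CHAIN_CLASSES = {
    "nonpolar_or_aliphatic": set("GAVLI"),
    "polar_uncharged": set("STCMNQ"),
    "aromatic": set("FYW"),
    "positively_charged": set("KRH"),
    "negatively_charged": set("DE"),
}


def side_chain_class(residue):
    """Return the class name for a one-letter residue, or None if unlisted."""
    for name, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return name
    return None


def is_conservative(original, substitute):
    """True when both residues differ and fall in the same listed class."""
    if original == substitute:
        return False  # not a substitution at all
    cls = side_chain_class(original)
    return cls is not None and cls == side_chain_class(substitute)


if __name__ == "__main__":
    for pair in [("L", "I"), ("D", "E"), ("E", "V"), ("L", "F")]:
        print(pair, is_conservative(*pair))
```

Under these general groupings alone, a leucine to phenylalanine change is not classed as conservative, although some embodiments described above list phenylalanine among the conservative substitutions for leucine; a fuller implementation would need per-residue overrides.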
Throughout the present invention, various amino acid mutations, including, for example, amino acid substitutions, are referred to using the amino acid one letter code and indicate the position of a residue within a reference amino acid sequence. In the case of amino acid substitutions, the identity of the substituent is also indicated using the amino acid one letter code. For example, reference to a hypothetical amino acid substitution "E397V", wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1, indicates an amino acid substitution in which a valine (V) residue replaces the glutamic acid (E) normally present at amino acid position 397 of the amino acid sequence of SEQ ID NO: 1. Some of the amino acid sequences disclosed herein begin with a methionine residue ("M"), which is typically introduced at the beginning of a nucleic acid sequence encoding a peptide that is desired to be expressed in a bacterial host cell. However, it is to be understood that the invention also encompasses all such amino acid sequences starting from the second amino acid residue, without including the first methionine residue.
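As a minimal sketch of how the substitution notation above may be read programmatically, the following Python example parses labels such as "E397V" and applies them to a reference sequence using 1-based numbering. The function names are hypothetical, and the short sequence in the usage example is a toy placeholder, not any sequence disclosed herein.

```python
import re

# One-letter original residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Parse a label such as 'E397V' into ('E', 397, 'V')."""
    match = _SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def split_combination(labels):
    """Split a combination such as 'L763F + E805I' into individual labels."""
    return [part.strip() for part in labels.split("+") if part.strip()]


def apply_substitutions(sequence, labels):
    """Apply substitutions to a sequence, checking the expected original residue."""
    residues = list(sequence)
    for label in labels:
        original, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(f"{label}: reference has {residues[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    toy_reference = "MKLEV"  # toy placeholder sequence
    print(apply_substitutions(toy_reference, ["E4V"]))
    print(split_combination("L763F + E805I"))
```

Checking the expected original residue before substituting guards against numbering mistakes, for example when a sequence is numbered with or without its initial methionine.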
As used herein, the terms "identity" or "percent identity," and variations thereof, when used in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences (or subsequences, such as biologically active fragments) that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using any one or more of the following sequence comparison algorithms: Needleman-Wunsch (see, e.g., Needleman, Saul B.; and Wunsch, Christian D. (1970) "A general method applicable to the search for similarities in the amino acid sequence of two proteins," Journal of Molecular Biology 48(3): 443-53); Smith-Waterman (see, e.g., Smith, Temple F.; and Waterman, Michael S. (1981) "Identification of Common Molecular Subsequences," J. Mol. Biol. 147: 195-); or BLAST (Basic Local Alignment Search Tool; see, e.g., Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) "Basic Local Alignment Search Tool," J Mol Biol 215(3): 403-).
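For illustration only, percent identity after a global alignment can be sketched as follows. This Python example implements a basic Needleman-Wunsch alignment with a linear gap penalty; the scoring values (match +1, mismatch -1, gap -1) and the choice of denominator (alignment length including gapped columns) are assumptions made for the sketch and are not prescribed by this specification. Production tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return the two aligned strings ('-' marks a gap)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned columns as a percentage of alignment length."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

Other conventions divide by the length of the shorter sequence or of the reference sequence, so a reported percent identity should always state which denominator was used.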
As used herein, the terms "identical" or "identity," and variations thereof, when used in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences (e.g., biologically active fragments) that have at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm or by visual inspection. Substantially identical sequences are typically considered "homologous", regardless of the family from which they actually originate.
Proteins and/or protein subsequences (e.g., biologically active fragments) are "homologous" when they are derived, naturally or artificially, from a common ancestral protein or protein sequence. Similarly, nucleic acids and/or nucleic acid sequences are homologous when they are naturally or artificially derived from a common ancestral nucleic acid or nucleic acid sequence. Homology is generally inferred from sequence similarity between two or more nucleic acids or proteins (or biologically active fragments or sequences thereof). The precise percentage of similarity between sequences that is useful for establishing homology varies with the nucleic acid and protein at issue, but sequence similarity as low as 25% over 25, 50, 100, 150 or more nucleic acid or amino acid residues is routinely used to establish homology. Higher levels of sequence similarity, e.g., 50%, 60%, 70%, 80%, 85%, 90%, 95%, 98%, or 99%, may also be used to establish homology.
Methods for determining percent sequence similarity (e.g., BLASTP and BLASTN using default parameters) are described herein and are generally available. For sequence comparison and homology determination, typically one sequence serves as a reference sequence to which test sequences are compared. Generally, when using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Next, the sequence comparison algorithm calculates the percent sequence identity of the test sequence relative to the reference sequence based on the specified program parameters. Optimal sequence alignment for comparison can be performed, for example, by: the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981); the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970); the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988); computerized implementations of these algorithms (GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.); or visual inspection (see generally Current Protocols in Molecular Biology, Ausubel et al., eds., Current Protocols, Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., supplemented through 2004).
An example of an algorithm suitable for determining percent sequence identity and sequence similarity (homology) is the BLAST algorithm, described in Altschul et al, J. Mol. Biol. 215: 403-. Software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length "W" in the query sequence that either match or satisfy some positive-valued threshold score "T" when aligned with a word of the same length in a database sequence. "T" is referred to as the neighborhood word score threshold (Altschul et al, supra). These initial neighborhood word hits serve as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as long as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters "M" (reward score for a pair of matching residues; always >0) and "N" (penalty score for mismatching residues; always <0). For amino acid sequences, cumulative scores are calculated using a scoring matrix. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity "X" from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters "W", "T", and "X" determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989) Proc. Natl. Acad. Sci. USA 89: 10915).
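The extension step and its three stopping conditions can be sketched, in simplified form, as follows. This Python example extends an ungapped nucleotide word hit in one direction only, using the reward M=5 and penalty N=-4 mentioned above. It is not the NCBI implementation: the function name `extend_right`, the default drop-off value of 20 and the use of "greater than or equal to" for the drop-off test are assumptions made for illustration.

```python
def extend_right(query, subject, q_start, s_start, seed_score,
                 reward=5, penalty=-4, x_drop=20):
    """Extend an ungapped hit rightward from (q_start, s_start).

    Returns (best_score, best_length), where best_length is the number of
    extended positions retained at the point of maximum cumulative score.
    """
    score = seed_score
    best_score, best_length = seed_score, 0
    offset = 0
    # Stop condition 3: the end of either sequence is reached.
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += reward if same else penalty
        offset += 1
        if score > best_score:
            best_score, best_length = score, offset
        # Stop condition 2: the cumulative score goes to zero or below.
        # Stop condition 1: the score falls off by X from its maximum.
        if score <= 0 or best_score - score >= x_drop:
            break
    return best_score, best_length


if __name__ == "__main__":
    query = "ACGTACGTTTTT"
    subject = "ACGTACGTAAAA"
    # Assume the first four bases formed the word hit (4 matches x reward 5).
    print(extend_right(query, subject, 4, 4, seed_score=20))
```

A complete search would run the same extension leftward from the start of the word hit and combine the two results into a single high scoring sequence pair.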
In addition to calculating percent sequence identity, the BLAST algorithm performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5787 (1993)). One similarity measure provided by the BLAST algorithm is the smallest sum probability (P (N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences will occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability of a test nucleic acid compared to a reference nucleic acid is less than about 0.1, less than about 0.01, or less than about 0.001.
As used herein, the term "primer extension activity" and variations thereof, when used with reference to a given polymerase, encompass any in vivo or in vitro enzymatic activity characteristic of the given polymerase that involves catalyzing nucleotide incorporation onto the terminal 3' OH end of an extending nucleic acid molecule. Typically, but not necessarily, such nucleotide incorporation occurs in a template-dependent manner. In some embodiments, the primer extension activity of a given polymerase can be quantified as the total number of nucleotides incorporated by a unit amount of polymerase (in moles) per unit time (second) under a particular set of reaction conditions (as measured by, for example, a radiometric or other suitable assay).
As used herein, when the term "thermostable" and variations thereof are used with reference to a given polymerase, it includes any in vivo or in vitro enzymatic activity characteristic of the given polymerase that involves catalyzing nucleotide incorporation at elevated temperatures without loss of the properties relating to such nucleotide incorporation. Typically, but not necessarily, such nucleotide incorporation occurs in a template-dependent manner. In some embodiments, the thermostability of a given polymerase can be quantified as the total number of nucleotides incorporated (as measured by, for example, a radiometric or other suitable assay) by a unit amount of polymerase (in moles) per unit time (minute) at a given temperature (℃ or ℉). In some embodiments, the thermostability of a given polymerase can be quantified by measuring the polymerization activity of a unit amount of polymerase (in moles) after 40 minutes of incubation at 95 ℃. In one embodiment, the thermostability of a given polymerase can be quantified by measuring the polymerization activity based on the half-life of the polymerase. For example, the half-life of Taq is greater than 2 hours at 92.5 ℃, 40 minutes at 95 ℃ and 9 minutes at 97.5 ℃ (Lawyer et al. (1993) PCR Methods Appl. 2(4): 275-87). Some of the examples described herein compare the relative amounts of nucleotide polymerization by a reference polymerase and a modified polymerase (e.g., nucleotide polymerization using SEQ ID NO: 1 compared to nucleotide polymerization using SEQ ID NO: 2). In these examples, the nucleotide polymerization properties of the reference polymerase and the modified polymerase (or biologically active fragment thereof) are assessed under the same conditions, including incubation at high temperatures, such as 95 ℃, 96 ℃, or 97 ℃, for various times, such as 2 minutes, 4 minutes, 6 minutes, or 8 minutes (see, e.g., example 10, FIGS. 11-14), followed by PCR reactions using the polymerases.
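To illustrate how a half-life relates to activity remaining after a heat challenge, the following Python sketch converts a half-life into a residual activity fraction. It assumes simple first-order (exponential) inactivation, which is a modeling assumption not stated in this specification; the only inputs taken from the text are the Taq half-lives at 95 ℃ and 97.5 ℃ quoted above, and the function and constant names are hypothetical.

```python
# Taq half-lives (minutes) quoted above; the 92.5 degree value is given only
# as "greater than 2 hours" and is therefore omitted.
TAQ_HALF_LIFE_MIN = {95.0: 40.0, 97.5: 9.0}


def residual_activity_fraction(minutes, half_life_minutes):
    """Fraction of activity remaining, ASSUMING first-order inactivation."""
    if minutes < 0 or half_life_minutes <= 0:
        raise ValueError("time must be >= 0 and half-life must be > 0")
    return 0.5 ** (minutes / half_life_minutes)


if __name__ == "__main__":
    for temperature, half_life in TAQ_HALF_LIFE_MIN.items():
        for challenge in (2, 4, 6, 8):  # incubation times used in the examples
            fraction = residual_activity_fraction(challenge, half_life)
            print(f"{temperature} C, {challenge} min: {fraction:.3f}")
```

Under this assumption, an enzyme with a 40 minute half-life would retain roughly 90% of its activity after 6 minutes, so differences between a reference and a modified polymerase over such short challenges are best judged from side-by-side measurements under identical conditions.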
Thermostable polymerases generally work best at about 70 ℃ (for Thermus aquaticus (Taq) polymerase, the optimum is 74 ℃; Taq incorporates approximately 2800 nucleotides/min at 70 ℃, 1400 nucleotides/min at 55 ℃, 90 nucleotides/min at 37 ℃ and about 15 nucleotides/min at 22 ℃). Polymerases from Pyrococcus furiosus (Pfu), Pyrococcus woesei (Pwo), Thermotoga maritima (Tma) and Thermococcus litoralis (Tli or Vent) are also encompassed within the scope of the present invention. These polymerases exhibit substantially higher temperature stability than Thermus aquaticus (Taq) polymerase.
As used herein, the term "accuracy" and variations thereof when used with reference to a given polymerase includes the longest perfect read (typically measured in terms of the number of nucleotides correctly included in the read) obtained from a nucleotide polymerization reaction. Thus, as used herein, the average read accuracy when referring to a given polymerase refers to the "average" perfect read obtained from a nucleotide polymerization reaction.
As used herein, the term "DNA binding activity" and variations thereof when used in reference to a given polymerase encompasses any in vivo or in vitro enzymatic activity characteristic of a given polymerase that involves the interaction of the polymerase with a DNA sequence in a recognition-based manner. Typically, but not necessarily, such interactions include polymerase binding, and more particularly binding of the DNA binding domain of the polymerase to the recognized DNA sequence. In some embodiments, recognition comprises binding of a polymerase to a sequence-specific or non-sequence-specific DNA sequence. In some embodiments, the DNA binding activity of a given polymerase can be quantified as the affinity of the polymerase to recognize and bind to the recognized DNA sequence. For example, when a protein-DNA complex is formed under a particular set of reaction conditions, changes in anisotropic signal (or other suitable assay) can be used to monitor and determine DNA binding activity.
As used herein, the term "biologically active fragment" and variants thereof, when used with reference to a given biomolecule, refers to any fragment, derivative, homolog, or analog of the biomolecule having in vivo or in vitro activity characteristic of the biomolecule itself. For example, a polymerase can be characterized by various biological activities, such as DNA binding activity, nucleotide polymerization activity, primer extension activity, strand displacement activity, reverse transcriptase activity, nick-initiated polymerase activity, 3'-5' exonuclease (proofreading) activity, thermostability, accuracy, processivity, and the like. In some embodiments, a "biologically active fragment" of a polymerase is any fragment, derivative, homolog, or analog of a polymerase that can catalyze the polymerization of nucleotides (including homologs and analogs thereof) into a nucleic acid strand. In some embodiments, a biologically active fragment, derivative, homolog or analog of a polymerase has 10%, 20%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% or more of the biological activity of the polymerase in any relevant in vivo or in vitro assay, such as a DNA binding assay, a nucleotide polymerization assay (which may be template-dependent or template-independent), a primer extension assay, a strand displacement assay, a reverse transcriptase assay, a proofreading assay, an accuracy assay, a thermostability assay, or the like.
In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the in vitro primer extension activity of the fragments under defined reaction conditions. In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the in vitro polymerization activity of the fragments under defined reaction conditions. In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the in vitro thermostability of the fragments under defined reaction conditions. In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the in vitro accuracy of the fragments under defined reaction conditions. In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the in vitro processivity of the fragments under defined reaction conditions. In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the in vitro strand displacement activity of the fragments under defined reaction conditions. In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the read length obtained in vitro with the fragments under defined reaction conditions. In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the strand bias obtained in vitro with the fragments under defined reaction conditions. In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the in vitro proofreading activity of the fragments under defined reaction conditions. In some embodiments, the biological activity of the polymerase fragments is assayed by measuring the output (e.g., sequencing throughput or average read length) of an in vitro assay performed with the polymerase fragments under defined reaction conditions.
In some embodiments, the biological activity of the polymerase fragments is analyzed by measuring the output of an in vitro nucleotide polymerization reaction under defined reaction conditions (e.g., the raw accuracy with which the polymerase fragment incorporates the correct Watson-Crick nucleotide in the nucleotide polymerization reaction). In some embodiments, assessing a biologically active fragment of a polymerase can include measuring any one or more of the biological activities of the polymerase outlined herein.
In some embodiments, a biologically active fragment can include any portion of the DNA binding domain or any portion of the catalytic domain of the modified polymerase. In some embodiments, a biologically active fragment may optionally include any 25, 50, 75, 100, 150 or more amino acid residues of the DNA binding domain or catalytic domain. In some embodiments, a biologically active fragment of a modified polymerase can include at least 25 contiguous amino acid residues of the catalytic domain or DNA binding domain having at least 80%, 85%, 90%, 95%, 98%, or 99% identity to any one or more of the polymerases encompassed by the present invention. In some embodiments, the biologically active fragment of the modified polymerase may include at least 25 contiguous amino acid residues of the catalytic domain or DNA binding domain having at least 80%, 85%, 90%, 95%, 98%, or 99% identity to any one or more of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
Biologically active fragments may optionally be present in vivo, such as fragments produced by post-transcriptional processing or by translation of alternatively spliced RNA or alternatively may be produced via engineering, total synthesis or other suitable manipulation. Biologically active fragments include fragments expressed in native or endogenous cells as well as those produced in expression systems, such as bacterial, yeast, insect or mammalian cells.
In some embodiments, the present invention relates generally not only to the particular polymerases disclosed herein, but also to any biologically active fragment of such polymerases that is encompassed within the scope of the present invention. In some embodiments, biologically active fragments of any polymerase of the invention include any fragment that exhibits in vitro primer extension activity. In some embodiments, biologically active fragments of any polymerase of the invention include any fragment that exhibits in vitro DNA binding activity. In some embodiments, a biologically active fragment of any polymerase of the invention includes any fragment that retains in vitro polymerase activity. Polymerase activity can be determined by any method known in the art. For example, the determination of polymerase activity can be based on the activity of extending a primer on a template.
In some embodiments, the present invention relates generally to a modified polymerase having one or more amino acid mutations relative to a reference polymerase lacking the one or more amino acid mutations (e.g., deletions, substitutions, or additions), and wherein the modified polymerase retains in vitro polymerase activity or exhibits in vitro primer extension activity. In some embodiments, modified polymerases include any biologically active fragment of such polymerases that retains in vitro processivity or exhibits in vitro thermostable activity.
In some embodiments, the present invention relates generally to a modified polymerase having one or more amino acid mutations relative to a reference polymerase lacking the one or more amino acid mutations (e.g., deletions, substitutions, or additions), and wherein the modified polymerase retains in vitro proofreading activity. Whether a polymerase exhibits exonuclease activity or exhibits reduced exonuclease activity can be readily determined by standard methods. For example, a polynucleotide can be synthesized such that a detectable proportion of the nucleotides are radiolabeled. These polynucleotides can be incubated in an appropriate buffer in the presence of the polypeptide to be tested. After incubation, the polynucleotides are precipitated, and exonuclease activity can be detected in the form of radioactive counts due to free nucleotides in the supernatant. As the skilled artisan will appreciate, an appropriate polymerase or biologically active fragment may be selected from those described herein based on any one or combination of the above biological activities, depending on the relevant application.
As used herein, the term "nucleotide" and variations thereof encompass any compound that can selectively bind to or be polymerized by a polymerase. Typically, but not necessarily, selective binding of nucleotides to a polymerase is followed by polymerization of the nucleotides by the polymerase into nucleic acid strands; occasionally, however, a nucleotide may dissociate from a polymerase without becoming incorporated into a nucleic acid strand, an event referred to herein as a "non-productive" event. Such nucleotides include not only naturally occurring nucleotides but also any analogs (regardless of their structure) that can selectively bind to or be polymerized by a polymerase. While naturally occurring nucleotides typically comprise base, sugar and phosphate moieties, the nucleotides of the invention may include compounds that lack any one, some or all of such moieties. In some embodiments, the nucleotide can optionally include a chain of phosphorus atoms comprising three, four, five, six, seven, eight, nine, ten, or more phosphorus atoms. In some embodiments, the phosphorus chain may be attached to any carbon of the sugar ring, such as the 5' carbon. The phosphorus chain may be linked to the sugar with an intervening O or S. In one embodiment, one or more phosphorus atoms in the chain may be part of a phosphate group having P and O. In another embodiment, the phosphorus atoms in the chain may be linked together with intervening O, NH, S, methylene, substituted methylene, ethylene, substituted ethylene, CNH2, C(O), C(CH2), CH2CH2, or C(OH)CH2R (wherein R may be 4-pyridine or 1-imidazole). In some embodiments, the phosphorus atoms in the chain may have pendant groups having O, BH3, or S. In the phosphorus chain, a phosphorus atom having a pendant group other than O may be a substituted phosphate group. Some examples of nucleotide analogs are described in Xu, U.S. patent No. 7,405,281.
In some embodiments, the nucleotide comprises a label (e.g., a reporter moiety) and is referred to herein as a "labeled nucleotide"; the label of the labeled nucleotide is referred to herein as a "nucleotide label". In some embodiments, the label may be in the form of a fluorescent dye attached to a terminal phosphate group (i.e., the phosphate group or substituted phosphate group furthest from the sugar). Some examples of nucleotides that may be used in the disclosed methods and compositions include, but are not limited to, ribonucleotides, deoxyribonucleotides, modified ribonucleotides, modified deoxyribonucleotides, ribonucleotide polyphosphates, deoxyribonucleotide polyphosphates, modified ribonucleotide polyphosphates, modified deoxyribonucleotide polyphosphates, peptide nucleotides, metallonucleosides, phosphonate nucleosides, and modified phosphate-sugar backbone nucleotides, analogs, derivatives, or variations of the foregoing, and the like. In some embodiments, a nucleotide may comprise a non-oxygen moiety (such as a thio or borane moiety) in place of an oxygen moiety that bridges the α phosphate and the sugar of the nucleotide, or the α and β phosphates of the nucleotide, or the β and γ phosphates of the nucleotide, or is located between any other two phosphates of the nucleotide, or any combination thereof.
As used herein, the term "nucleotide incorporation" and variations thereof includes polymerizing one or more nucleotides to form a nucleic acid strand comprising at least two nucleotides that are typically, but not necessarily, linked to each other via a phosphodiester bond, although alternative linkages may be possible in the case of particular nucleotide analogs.
As used herein, the term "processivity" and variations thereof encompasses the ability of a polymerase to remain bound to a single primer/template hybrid. When used with reference to a given polymerase, the term processivity encompasses the number of nucleotides attached to the 3' end of the nucleic acid (e.g., the 3'-OH group of the DNA strand) by the polymerase in a single cycle. This number reflects the polymerization rate and the dissociation constant (Kd) of the polymerase. In some embodiments, processivity can be measured by the number of nucleotides incorporated into a nucleic acid (e.g., a sequencing primer) by the polymerase prior to dissociation of the polymerase from the primer/template hybrid. In some embodiments, the polymerase has a processivity of at least 100 nucleotides, and in other embodiments it has a processivity of at least 200 nucleotides, at least 300 nucleotides, at least 400 nucleotides, at least 500 nucleotides, or more. It will be appreciated by those of ordinary skill in the art that the higher the processivity of the polymerase, the more nucleotides can be incorporated prior to dissociation and, therefore, the longer the sequence (read length) that can be obtained. In other words, polymerases with lower processivity will typically provide shorter average read lengths compared to polymerases with higher processivity. In one embodiment, a polymerase of the invention comprising one or more amino acid mutations can have improved processivity as compared to a polymerase lacking the one or more amino acid mutations.
In one exemplary assay, the processivity of a given polymerase can be measured as follows: the polymerase is incubated with a primer:template duplex under nucleotide incorporation conditions, and the resulting primer extension products are resolved using any suitable method, e.g., via gel electrophoresis. The primer may optionally include a label to enhance the detectability of the primer extension products. Nucleotide incorporation reaction mixtures typically include a large excess of unlabeled competing template, ensuring that nearly all extension products are produced via a single template binding event. Following such resolution, the average amount of full-length extension product may be quantified using any suitable means, including fluorescence or radiometric detection of the full-length extension product. To compare the processivity of two or more different enzymes (e.g., reference and modified polymerases), the various enzymes can be employed in parallel and separate reactions, after which the resulting full-length primer extension products can be resolved and measured, and such measurements compared.
In other exemplary embodiments, the processivity of a given polymerase can be measured using any suitable assay known in the art, including (but not limited to) the assays described in: Von Hippel, P.H., Fairfield, F.R., and Dolejsi, M.K., Ann. NY Acad. Sci., 726: 118-; Bambara, R.A., Uyemura, D. and Choi, T., On the processive mechanism of Escherichia coli DNA polymerase I. Quantitative assessment of processivity, J. Biol. Chem., 253:413-423 (1978); Processiveness of DNA polymerases. A comparative study using a simple procedure, J. Biochem., 254: 1227-; Nasir, M.S. and Jolley, M.E., Fluorescence polarization: An Analytical Tool for Immunoassay and Drug Discovery, Combinatorial Chemistry and High Throughput Screening, 2:177-190 (1999); Mestas, S.P., Sholders, A.J., and Peersen, O.B., A Fluorescence Polarization Based Screening Assay for Nucleic Acid Polymerase Elongation Activity, Anal. Biochem., 365:194-200 (2007); Nikiforov, T.T., Fluorogenic polymerase, endonuclease, and ligase assays based on DNA substrates labeled with a single fluorophore, Analytical Biochemistry, 412: 229-; and Yan Wang, Dennis E. Prosen, Li Mei, John C. Sullivan, Michael Finney and Peter B. Vander Horn, Nucleic Acids Research, 32(3): 1197-.
As used herein, the term "read length" or "read-length" and variations thereof refers to the number of nucleotides that are polymerized (or incorporated into an existing nucleic acid strand) by a polymerase in a template-dependent manner prior to dissociation from a template nucleic acid strand. In some embodiments, a polymerase that dissociates from a template nucleic acid strand after five incorporations will typically provide a sequence having a read length of 5 nucleotides, while a polymerase that dissociates from a template nucleic acid strand after 500 incorporations will typically provide a sequence having a read length of about 500 nucleotides. While the actual or absolute processivity (or the actual read length of the polymerization product produced by a polymerase) of a given polymerase may vary from reaction to reaction (or even within a single reaction mixture where the polymerase produces different products having different read lengths), the polymerase may be characterized by the average processivity (or average read length of the polymerization product) observed under a defined set of reaction conditions. An "error-free read length" encompasses the number of nucleotides that are consecutively incorporated into a newly synthesized nucleic acid strand without error (i.e., without mismatch and/or deviation from a set of established and predictable base-pairing rules).
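As an illustration of the two quantities defined above, the following is a minimal sketch that computes an average read length over a set of reads and an error-free read length for a single read. The function names, and the choice to count the error-free run from the first base of the read against an already aligned reference of the same orientation, are assumptions made for illustration and are not taken from the disclosure.

```python
from statistics import mean


def error_free_read_length(read: str, reference: str) -> int:
    """Count bases from the start of the read that match the aligned
    reference before the first mismatch (assumed definition)."""
    count = 0
    for read_base, ref_base in zip(read, reference):
        if read_base != ref_base:
            break
        count += 1
    return count


def average_read_length(reads: list[str]) -> float:
    """Average number of nucleotides per read under one set of conditions."""
    return mean(len(read) for read in reads)


if __name__ == "__main__":
    # Hypothetical reads, used only to exercise the functions.
    print(error_free_read_length("ACGTAC", "ACGTTC"))
    print(average_read_length(["ACGT", "AC"]))
```

Comparing the averages obtained for a modified and a reference polymerase under the same conditions gives the kind of read-length comparison described in this section.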
As used herein, the term "systematic error" or "SE" and variations thereof refers to the percentage of error present in a sequence motif containing a homopolymer of defined length, where systematic deletions occur at a specified minimum frequency on a nucleic acid strand, and where sequencing coverage occurs at a specified minimum frequency. For example, in some embodiments, the systematic error can be measured as a percentage of the error in sequence motifs containing homopolymers of length 1-6, wherein systematic deletions occur on the strands with a frequency greater than 15% when the coverage (of the sequencing operation) is equal to or greater than 20 x. In some embodiments, the systematic error is estimated as the percentage of random errors in sequence motifs containing homopolymers of length 1-6, wherein systematic deletions occur on the strands at a frequency greater than 15% when the coverage (of the sequencing operation) is equal to or greater than 20 x; such embodiments are the focus of several working examples disclosed herein. In some embodiments, the percent systematic error is reduced when using a modified polymerase as disclosed herein compared to a reference polymerase that does not contain one or more amino acid modifications (e.g., wild-type Taq polymerase). While the actual systematic error for a given polymerase may vary from reaction to reaction (or even within a single reaction mixture), the polymerase can be characterized by the percentage of systematic error observed under a defined set of reaction conditions. In some embodiments, the modified polymerases of the present application have a reduced percentage of systematic error as compared to a corresponding reference polymerase that does not have one or more amino acid modifications. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 3%. 
In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 1%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.9%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.8%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.7%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.6%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.5%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.4%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.3%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.2%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.1%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.09%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.08%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.05%. In some embodiments, a modified polymerase as disclosed herein contains a percent systematic error of less than 0.04%.
As used herein, the term "strand bias" refers to the percentage of target bases in a sequencing operation for which the read (genotype) inferred from one strand (e.g., the positive strand) differs from the read (genotype) inferred from the other (e.g., the negative) strand. The coverage of a given target base can be calculated by counting the number of read bases mapped to the target base in the alignment. The average coverage can be calculated by taking the average of this value across every base in the target. Next, the relative coverage of a particular base is calculated as the ratio of these values. A relative coverage of 1 indicates that the particular base is covered at the expected average rate. A relative coverage greater than 1 indicates higher than expected coverage, and a relative coverage less than 1 indicates lower than expected coverage. In general, the probability of ambiguous mapping increases as the reads become shorter or less accurate. The likelihood of ambiguous mapping is also higher for reads derived from repetitive or low-complexity regions of the genome, including some regions with extreme (high) GC content. In some embodiments, the percentage of strand bias is reduced when using a modified polymerase as disclosed herein, as compared to a reference polymerase that does not contain the corresponding one or more amino acid modifications (e.g., wild-type Taq polymerase). In some embodiments, the modified polymerases of the present application have reduced strand bias as compared to a corresponding unmodified polymerase. While the actual strand bias of a given polymerase may vary from reaction to reaction (or even within a single reaction mixture), the polymerase can be characterized by the percentage of target bases that are observed to have no strand bias under a defined set of reaction conditions.
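The coverage arithmetic and the strand bias percentage described above can be sketched as follows. This is an illustrative sketch only: the function names and the input representation (a per-base depth list, and one genotype call per target base for each strand) are assumptions and do not describe any particular analysis pipeline.

```python
def relative_coverage(depths: list[int]) -> list[float]:
    """Divide each target base's coverage by the average coverage
    across all bases in the target."""
    average = sum(depths) / len(depths)
    return [depth / average for depth in depths]


def percent_strand_bias(forward_calls: str, reverse_calls: str) -> float:
    """Percentage of target bases whose genotype inferred from the
    forward strand differs from that inferred from the reverse strand."""
    if len(forward_calls) != len(reverse_calls):
        raise ValueError("both strands must cover the same target bases")
    differing = sum(1 for f, r in zip(forward_calls, reverse_calls) if f != r)
    return 100.0 * differing / len(forward_calls)


if __name__ == "__main__":
    # Hypothetical depths and calls, used only to exercise the functions.
    print(relative_coverage([10, 20, 30]))      # values below, at and above 1
    print(percent_strand_bias("ACGT", "ACGA"))  # one of four bases differs
```

The percentage of target bases without strand bias, which is the figure used to characterize a polymerase in this section, is 100 minus the value returned by `percent_strand_bias`.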
In some embodiments, a modified polymerase as disclosed herein comprises a percentage of target bases without strand bias greater than 25%. In some embodiments, a modified polymerase as disclosed herein comprises a target base percentage of about 30% without strand bias. In some embodiments, a modified polymerase as disclosed herein comprises a target base percentage of about 40% without strand bias. In some embodiments, a modified polymerase as disclosed herein comprises a target base percentage of about 45% without strand bias. In some embodiments, a modified polymerase as disclosed herein comprises a target base percentage of about 50% without strand bias. In some embodiments, a modified polymerase as disclosed herein comprises a target base percentage of about 60% without strand bias. In some embodiments, a modified polymerase as disclosed herein comprises a target base percentage of about 70% without strand bias. In some embodiments, a modified polymerase as disclosed herein comprises a target base percentage of about 75% without strand bias. In some embodiments, a modified polymerase as disclosed herein comprises a target base percentage of about 80% without strand bias. In some embodiments, a modified polymerase as disclosed herein comprises a target base percentage of about 85% without strand bias. Conversely, in some embodiments, a modified polymerase as disclosed herein can include a percentage of target bases with strand bias of about 15%. In another embodiment, a modified polymerase as disclosed herein can include a percentage of target bases with strand bias of about 20%, 25%, 30%, 35%, 40%, 45%, or 50%.
The term "signal-to-noise ratio" or "SNR" refers to the ratio of signal power to noise power. In general, SNR is a measure of a desired signal compared to a background noise level. In some embodiments, a "signal-to-noise ratio" may refer to the ratio of the signal power obtained during a sequencing operation compared to the background noise of the same sequencing operation. In some embodiments, the present application discloses methods, kits, devices, and compositions that provide a means of improving signal-to-noise ratio. In some embodiments, the present disclosure relates generally to a method for performing nucleic acid sequencing, comprising contacting a modified polymerase in the presence of one or more nucleotides with a nucleic acid template, wherein the modified polymerase comprises one or more amino acid modifications (e.g., substitutions) relative to a reference polymerase and has an increased signal-to-noise ratio relative to a reference polymerase that does not have the one or more amino acid modifications; and polymerizing at least one of the one or more nucleotides using the modified polymerase.
In some embodiments, the present invention generally relates to compositions, methods, systems, devices, and kits comprising a modified polymerase characterized by increased processivity, increased read length (including error-free read length), increased total sequencing throughput, improved thermostability, and/or increased accuracy compared to its unmodified counterpart (e.g., a reference polymerase); and to methods for making and using such modified polymerases for a wide range of biological and chemical reactions, such as nucleotide polymerization, primer extension, nucleic acid library generation, and nucleic acid sequencing reactions.
In some embodiments, the present invention generally relates to compositions, methods, systems, devices, and kits comprising a modified polymerase characterized by reduced strand bias and/or reduced systematic error as compared to its unmodified counterpart (e.g., a reference polymerase); and to methods for making and using such modified polymerases for a wide range of biological and chemical reactions, such as nucleotide polymerization, primer extension, nucleic acid library generation, and nucleic acid sequencing reactions.
In some embodiments, a modified polymerase encompassed within the scope of the invention includes one or more amino acid mutations (e.g., amino acid substitutions, additions, or deletions) relative to the corresponding counterpart lacking the same mutation. In some embodiments, the term "accuracy" as used herein can be measured by determining the incorporation rate of a correct nucleotide during polymerization as compared to the incorporation rate of an incorrect nucleotide during polymerization. In some embodiments, the rate of incorporation of incorrect nucleotides may be greater than 0.3, 0.4, 0.5, 0.6, 0.7 seconds or greater under elevated salt conditions (e.g., high ionic strength solution) as compared to standard (low ionic strength solution) salt conditions. While not wishing to be bound by any particular theory, applicants have found that the presence of elevated salts during polymerization slows the rate of incorrect nucleotide incorporation, resulting in a slower incorporation constant for the incorrect nucleotide. In some embodiments, the modified polymerases of the invention have enhanced accuracy as compared to a reference polymerase lacking the corresponding mutation; optionally, the modified polymerase or biologically active fragment thereof has enhanced accuracy (compared to a reference polymerase lacking the corresponding amino acid mutation) in the presence of a high ionic strength solution. Generally, as used herein, a standard ionic strength solution refers to an ionic solution having less than 120 mM salt. In another embodiment, a standard ionic strength solution as used herein refers to an ionic solution having less than 100 mM salt.
In some embodiments, the present invention relates generally to a modified polymerase that retains polymerase activity and/or primer extension activity in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution may be at least 120mM salt concentration. In some embodiments, the high ionic strength solution is a salt concentration of 125mM to 200 mM. In some embodiments, the salt may include potassium and/or sodium salts, such as KCl and/or NaCl. It will be apparent to the skilled person that various other suitable salts may be used instead of or in combination with KCl and/or NaCl. In some embodiments, the ionic strength solution may further comprise a sulfate.
In some embodiments, the modified polymerase can amplify and/or sequence a nucleic acid molecule in the presence of a high ionic strength solution. In some embodiments, the modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater extent (e.g., as measured by "accuracy") than a reference polymerase lacking one or more of the corresponding mutations (or homologous mutations) under the same conditions. In some embodiments, the modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater capacity (e.g., as measured by "accuracy") as compared to a reference polymerase lacking one or more of the corresponding mutations (or homologous mutations) under standard ionic strength conditions (i.e., low ionic strength as compared to a high ionic strength solution).
In some embodiments, the present invention generally relates to a modified polymerase or biologically active fragment thereof that can undergo nucleotide polymerization or nucleotide incorporation in the presence of high ionic strength conditions compared to a reference polymerase under the same conditions.
In some embodiments, the present invention relates generally to a modified polymerase or biologically active fragment thereof having increased accuracy or increased processivity in the presence of high ionic strength conditions compared to a reference polymerase under the same conditions.
In some embodiments, the present invention relates generally to a modified polymerase or biologically active fragment thereof that can detect changes in ionic concentration during nucleotide polymerization in the presence of high ionic strength salt conditions as compared to a reference polymerase under the same conditions.
In some embodiments, the present invention generally relates to a modified polymerase or biologically active fragment thereof that can amplify or sequence a nucleic acid molecule in the presence of a high ionic strength solution.
In some embodiments, the present invention generally relates to a modified polymerase or biologically active fragment thereof with increased accuracy compared to a reference polymerase under the same conditions.
In some embodiments, the present invention generally relates to methods, compositions, systems, and kits comprising the use of such modified polymerases in nucleotide polymerization reactions, including nucleotide polymerization reactions in which sequence information is obtained from a nucleic acid molecule. In some embodiments, the present invention generally relates to methods, compositions, systems, and kits comprising the use of such modified polymerases in clonal amplification reactions, including nucleic acid library synthesis. In some embodiments, the invention relates to methods for using such modified polymerases in ion-based nucleic acid sequencing reactions, wherein sequence information is obtained from a template nucleic acid using an ion-based sequencing system. In some embodiments, the present invention generally relates to compositions, methods, systems, kits, and apparatuses for performing a variety of label-free DNA sequencing reactions (e.g., ion-based sequencing reactions) using large-scale arrays of electronic sensors, such as field effect transistors ("FETs").
In some embodiments, the present invention generally relates to compositions (and related methods, systems, kits, and devices using such compositions) comprising a modified polymerase that includes at least one amino acid modification (e.g., amino acid substitution, addition, deletion, or chemical modification) relative to a reference polymerase (wherein the reference polymerase does not include the at least one amino acid modification), wherein the modified polymerase is optionally characterized by a change (e.g., an increase or decrease) in any one or more of the following properties relative to a reference polymerase: thermal stability, read length, accuracy, strand bias, systematic error, total sequencing throughput, performance in salt (i.e., ionic strength), and processivity.
As used herein, the terms "Q17" or "Q20" and variations thereof, when used with reference to a given polymerase, refer to certain aspects of the performance of the polymerase, particularly the accuracy in a given polymerase reaction, e.g., a polymerase-based sequencing-by-synthesis reaction. For example, in a particular sequencing reaction, an accuracy metric may be calculated via a predictive algorithm or via actual alignment to a known reference genome. The predicted quality score ("Q" score) can be derived from algorithms that look at the intrinsic properties of the input signal and obtain fairly accurate estimates as to whether a given single base included in the sequencing "read" will align. In some embodiments, such predicted quality scores may be suitable for filtering and removing lower quality reads prior to downstream alignment. In some embodiments, accuracy may be reported in terms of a Phred-like Q score that measures accuracy on a logarithmic scale such that: Q10 = 90%, Q17 = 98%, Q20 = 99%, Q30 = 99.9%, Q40 = 99.99%, and Q50 = 99.999%. The Phred quality score ("Q") is defined as a property logarithmically related to the base call error probability ("P"). The formula usually given for calculating "Q" is Q = 10 × log10(1/error rate). In some embodiments, data obtained from a given polymerase reaction may be filtered to count only polymerase reads that measure "N" nucleotides or longer and have a Q score that exceeds a certain threshold, e.g., Q10, Q17, Q100 (referred to herein as the "NQ17" score). For example, a 100Q20 score may indicate the number of reads obtained from a given reaction that are at least 100 nucleotides in length and have a Q score of Q20 (99%) or greater. Similarly, a 200Q20 score may indicate the number of reads that are at least 200 nucleotides in length and have a Q score of Q20 (99%) or greater.
In some embodiments, the accuracy can also be calculated based on a proper alignment against a reference genomic sequence, referred to herein as "raw" accuracy. This is single-pass accuracy, which involves measuring the "true" per-base error associated with a single read, as opposed to consensus accuracy, which measures the error rate of a consensus sequence resulting from multiple reads. Raw accuracy measurements can be reported in terms of "AQ" scores (for alignment quality). In some embodiments, data obtained from a given polymerase reaction may be filtered to count only polymerase reads that measure "N" nucleotides or longer and have an AQ score exceeding a certain threshold, e.g., AQ10, AQ17, AQ100 (referred to herein as the "NAQ17" score). For example, a 100AQ20 score may indicate the number of reads obtained from a given polymerase reaction that are at least 100 nucleotides in length and have an AQ score of AQ20 (99%) or greater. Similarly, a 200AQ20 score may indicate the number of reads that are at least 200 nucleotides in length and have an AQ score of AQ20 (99%) or greater.
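The logarithmic relationship between a Q score and the base call error probability, and the "NQ" style read count described above, can be sketched as follows. The function names and the representation of each read as a (length, Q score) pair are assumptions made for illustration.

```python
import math


def phred_q(error_probability: float) -> float:
    """Q = 10 * log10(1 / P), written as -10 * log10(P)."""
    return -10.0 * math.log10(error_probability)


def accuracy_from_q(q: float) -> float:
    """Invert the Q score: accuracy = 1 - 10^(-Q / 10)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def count_nq_reads(reads: list[tuple[int, float]], min_length: int, min_q: float) -> int:
    """Count reads at least min_length nucleotides long whose Q score
    meets the threshold (e.g., min_length=100, min_q=20 for 100Q20)."""
    return sum(1 for length, q in reads if length >= min_length and q >= min_q)


if __name__ == "__main__":
    print(phred_q(0.01))        # an error probability of 1% is about Q20
    print(accuracy_from_q(17))  # Q17 is roughly 98% accuracy
    # Hypothetical reads as (length, Q score) pairs.
    print(count_nq_reads([(150, 25.0), (90, 30.0), (120, 18.0)], 100, 20.0))
```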
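Raw (single-read) accuracy differs from the predicted Q score in that the error rate is taken from an actual alignment. A minimal sketch, assuming each aligned read is summarized by its aligned length and its number of errors against the reference, could look as follows; the function names and the input layout are hypothetical, and real pipelines may define the AQ read length differently.

```python
import math


def aq_score(aligned_length: int, errors: int) -> float:
    """Alignment-based quality of one read from its observed error rate."""
    if errors == 0:
        return math.inf  # no observed errors, so no finite score
    return -10.0 * math.log10(errors / aligned_length)


def count_naq_reads(alignments: list[tuple[int, int]], min_length: int, min_aq: float) -> int:
    """Count reads of at least min_length nucleotides with an AQ score at
    or above min_aq (e.g., min_length=100, min_aq=20 for 100AQ20)."""
    return sum(
        1
        for aligned_length, errors in alignments
        if aligned_length >= min_length and aq_score(aligned_length, errors) >= min_aq
    )


if __name__ == "__main__":
    # Hypothetical (aligned length, error count) pairs.
    alignments = [(250, 1), (150, 3), (80, 0)]
    print(count_naq_reads(alignments, 100, 20.0))
```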
In some embodiments, the accuracy of the polymerase (including, for example, accuracy in a given sequencing reaction) can be measured with respect to the total number of "perfect" (i.e., zero error) reads of greater than 100, 200, 300, 400, 500, 750, 1000, 5000, 10000, 100000 nucleotides in length obtained from the polymerase reaction.
In some embodiments, the accuracy of the polymerase can be measured with respect to the longest perfect read obtained from the polymerase reaction (typically measured with respect to the number of nucleotides included in the read).
In some embodiments, the accuracy of the polymerase can be measured with respect to the fold increase in sequencing throughput obtained in a given sequencing reaction. For example, in some embodiments, the increased accuracy of an exemplary modified polymerase encompassed by the scope of the invention as compared to a reference polymerase (or an unmodified naturally occurring polymerase) can be 2-fold, 5-fold, 10-fold, 20-fold, 50-fold, 75-fold, 100-fold, 150-fold, 200-fold, 400-fold, 500-fold, or greater accuracy.
In some embodiments, the accuracy of the polymerase can be measured in terms of the percent increase in templating efficiency obtained in a given polymerization reaction. For example, in some embodiments, the increased accuracy of an exemplary modified polymerase encompassed within the scope of the invention as compared to a reference polymerase under the same polymerization conditions can be 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50% or greater accuracy.
Some exemplary non-limiting descriptions of accuracy metrics may be found in: Ewing B, Hillier L, Wendl MC, Green P. (1998): Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8(3): 175-; Ewing B, Green P. (1998): Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8(3): 186-; Dear S, Staden R (1992): A standard file format for data from DNA sequencing instruments. DNA Sequence, 3, 107-110; Bonfield JK, Staden R (1995): The application of numerical estimates of base calling accuracy to DNA sequencing projects. Nucleic Acids Res. 1995 Apr 25; 23(8): 1406-10, which are incorporated herein by reference in their entirety.
In some embodiments, the accuracy of a given set of polymerases (including any of the reference or modified polymerases described herein) can be measured in an ion-based sequencing reaction; such accuracies can optionally be compared to each other to determine whether a given amino acid mutation increases or decreases sequencing accuracy relative to a reference and/or unmodified polymerase. In some embodiments, the accuracy of the one or more polymerases can be measured using any ion-based sequencing equipment supplied by Ion Torrent Technologies (Life Technologies, CA), including, for example, the Ion Torrent PGM™ or Proton™ sequencers, optionally using sequencing protocols and reagents provided by Ion Torrent Systems. Some examples of accuracy calculations using ion-based sequencing systems are described in the Ion Torrent Application Note entitled "Ion Torrent: Ion Personal Genome Machine™ Performance Overview, Performance Spring 2011" (Life Technologies, South San Francisco, California), which is incorporated herein by reference in its entirety. In some embodiments, the accuracy of one or more modified polymerases prepared according to the present disclosure can be determined using any suitable method and/or any suitable next generation sequencing platform (e.g., Roche 454 GS or Illumina HiSeq, MiSeq, or HiSeq X Ten platforms).
As used herein, the terms "dissociation rate constant" and "dissociation time constant" when used with reference to a given polymerase refer to the time constant ("koff") for dissociation of the polymerase from a nucleic acid template under a defined set of reaction conditions. Some exemplary assays for measuring the dissociation time constant of a polymerase are further described below. In some embodiments, the dissociation time constant may be measured in units of inverse time, e.g., sec⁻¹ or min⁻¹.
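For illustration, a koff value in units of inverse time can be estimated from a time course of the fraction of polymerase still bound to template. The sketch below assumes simple first-order dissociation, that is, fraction bound = exp(-koff × t); that kinetic model, the function names and the data are assumptions made for illustration and are not one of the assays of the disclosure.

```python
import math


def fraction_bound(koff: float, time: float) -> float:
    """Fraction of polymerase still bound at a given time, assuming
    first-order dissociation (an illustrative assumption)."""
    return math.exp(-koff * time)


def estimate_koff(times: list[float], fractions: list[float]) -> float:
    """Least-squares fit of ln(fraction) = -koff * t through the origin.
    The result has units of inverse time (e.g., sec^-1 if t is in sec)."""
    numerator = sum(t * math.log(f) for t, f in zip(times, fractions))
    denominator = sum(t * t for t in times)
    return -numerator / denominator


if __name__ == "__main__":
    # Synthetic time course generated with koff = 0.5 sec^-1.
    times = [1.0, 2.0, 4.0]
    fractions = [fraction_bound(0.5, t) for t in times]
    print(estimate_koff(times, fractions))
```

Under this model a smaller koff means the polymerase stays on the template longer, which is consistent with the link between dissociation and processivity drawn earlier in this section.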
In some embodiments, the present invention relates generally to methods (and related kits, systems, devices, and compositions) for using an isolated modified polymerase including at least one amino acid modification relative to a reference polymerase (which lacks the at least one amino acid modification) and for increasing the average read length of primer extension products in a primer extension reaction using the modified polymerase relative to the average read length of primer extension products obtained using the reference polymerase under the same conditions. In some embodiments, the isolated modified polymerase increases the average error-free read length of a primer extension product in a primer extension reaction using the modified polymerase relative to the average error-free read length of a primer extension product obtained using a corresponding polymerase lacking one or more amino acid modifications. In some embodiments, the isolated polymerase having at least one amino acid modification increases the average error-free read length relative to a reference polymerase by at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70% or more compared to the average error-free read length of a reference polymerase lacking the at least one amino acid modification under the same conditions. Optionally, the modified polymerase includes one or more amino acid substitutions relative to the unmodified polymerase. In some embodiments, the modified polymerase includes two or more amino acid substitutions relative to a reference polymerase lacking the two or more amino acid substitutions. In some embodiments, the primer extension reaction is an ion-based sequencing reaction. In some embodiments, the primer extension reaction is an emPCR-based amplification reaction. In some embodiments, the primer extension reaction is a bridge PCR amplification reaction. 
In some embodiments, the primer extension reaction includes a label, such as a reversible terminator, in the primer extension reaction.
In some embodiments, the reference polymerase is a naturally occurring or wild-type polymerase. In some embodiments, the reference polymerase is a naturally occurring thermostable DNA polymerase. In some embodiments, the reference polymerase is full-length wild-type Taq DNA polymerase. In some embodiments, the reference polymerase is a truncated Taq DNA polymerase with no amino acid modifications (e.g., Klentaq-235 DNA polymerase). In other embodiments, the reference polymerase includes a derivative, truncated, mutant, or variant form of a naturally occurring polymerase that is different from the modified polymerase. For example, a reference polymerase can omit one or more amino acid mutations (e.g., one or more substitutions, deletions, or additions) as compared to a modified polymerase.
In some embodiments, the present invention relates generally to a method for performing a nucleotide polymerization reaction, comprising: contacting the modified polymerase with a nucleic acid template in the presence of one or more nucleotides; and polymerizing at least one of the one or more nucleotides using the modified polymerase. The polymerizing optionally further comprises polymerizing at least one nucleotide in a template-dependent manner. In some embodiments, the modified polymerase includes one or more amino acid substitutions relative to a reference polymerase that does not include the one or more amino acid substitutions.
In some embodiments, the method further comprises hybridizing a primer to the template before, during, or after the contacting. The polymerizing may include polymerizing at least one nucleotide onto one end of the primer using the modified polymerase.
In some embodiments, the polymerizing is performed in the vicinity of a sensor capable of detecting the polymerization of at least one nucleotide by the modified polymerase.
In some embodiments, the method further comprises using a sensor to detect a signal indicative of at least one of the one or more nucleotides being polymerized by the modified polymerase.
In some embodiments, the modified polymerase, the reference polymerase, or both are DNA polymerases. The DNA polymerase may include, but is not limited to, bacterial DNA polymerase, prokaryotic DNA polymerase, eukaryotic DNA polymerase, archaeal DNA polymerase, viral DNA polymerase, or bacteriophage DNA polymerase.
In some embodiments, the DNA polymerase is selected from the group consisting of: family A DNA polymerases, family B DNA polymerases, hybrid polymerases, unclassified DNA polymerases, and RT family polymerases, as well as variants and derivatives thereof.
In some embodiments, the DNA polymerase is a family A DNA polymerase selected from the group consisting of: Pol I-type DNA polymerases, such as E. coli DNA polymerase, Klenow fragment of E. coli DNA polymerase, Bst DNA polymerase, Taq DNA polymerase, Platinum Taq DNA polymerase series, Omni Klen Taq DNA polymerase series, T7 DNA polymerase and Tth DNA polymerase. In some embodiments, the DNA polymerase is Bst DNA polymerase. In other embodiments, the DNA polymerase is E. coli DNA polymerase. In some embodiments, the DNA polymerase is the Klenow fragment of E. coli DNA polymerase. In some embodiments, the polymerase is Taq DNA polymerase. In some embodiments, the polymerase is T7 DNA polymerase.
In other embodiments, the DNA polymerase is a family B DNA polymerase selected from the group consisting of: Bst polymerase, Tli polymerase, Pfu turbo polymerase, Pyrobest polymerase, Pwo polymerase, KOD polymerase, Sac polymerase, Sso polymerase, Poc polymerase, Pab polymerase, Mth polymerase, Pho polymerase, ES4 polymerase, VENT polymerase, DEEPVENT polymerase, Therminator™ polymerase, phage Phi29 polymerase, and phage B103 polymerase. In some embodiments, the polymerase is KOD polymerase. In some embodiments, the polymerase is a Therminator™ polymerase. In some embodiments, the polymerase is a bacteriophage Phi29 DNA polymerase. In some embodiments, the polymerase is a bacteriophage B103 polymerase, including, for example, variants disclosed in U.S. patent publication No. 20110014612, which is incorporated herein by reference in its entirety.
In other embodiments, the DNA polymerase is a hybrid polymerase selected from the group consisting of: EX-Taq polymerase, LA-Taq polymerase, Expand polymerase series, and Hi-Fi polymerase. In yet other embodiments, the DNA polymerase is an unclassified DNA polymerase selected from the group consisting of: Tbr polymerase, Tfl polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tih polymerase, and Tfi polymerase.
In other embodiments, the DNA polymerase is a Reverse Transcriptase (RT) polymerase selected from the group consisting of: HIV reverse transcriptase, M-MLV reverse transcriptase, and AMV reverse transcriptase. In some embodiments, the polymerase is HIV reverse transcriptase or a fragment thereof having DNA polymerase activity and/or primer extension activity.
Suitable bacterial DNA polymerases include, but are not limited to, E. coli DNA polymerases I, II, III, IV, and V; Klenow fragment of E. coli DNA polymerase; Clostridium stercorarium (Cst) DNA polymerase, Clostridium thermocellum (Cth) DNA polymerase, Bacillus stearothermophilus (Bst) DNA polymerase, and Sulfolobus solfataricus (Sso) DNA polymerase.
Suitable eukaryotic DNA polymerases include, but are not limited to, DNA polymerases α, δ, ε, η, ζ, γ, β, σ, λ, μ, ι, and κ, as well as Rev1 polymerase (terminal deoxycytidyl transferase) and terminal deoxynucleotidyl transferase (TdT).
Suitable viral and/or phage DNA polymerases include, but are not limited to, T4 DNA polymerase, T5 DNA polymerase, T7 DNA polymerase, Phi-15 DNA polymerase, Phi-29 DNA polymerase (see, e.g., U.S. patent No. 5,198,543; also variously referred to as Φ29 polymerase or Phi29 polymerase); Φ15 polymerase (also referred to herein as Phi-15 polymerase); Φ21 polymerase (Phi-21 polymerase); PZA polymerase; PZE polymerase; PRD1 polymerase; Nf polymerase; M2Y polymerase; SF5 polymerase; f1 DNA polymerase; Cp-1 polymerase; Cp-5 polymerase; Cp-7 polymerase; PR4 polymerase; PR5 polymerase; PR722 polymerase; L17 polymerase; M13 DNA polymerase; RB69 DNA polymerase; G1 polymerase; GA-1 polymerase; BS32 polymerase; B103 polymerase; a polymerase or derivative thereof obtained from any phi-29-like bacteriophage, and the like. See, e.g., U.S. patent No. 5,576,204, filed on February 11, 1993; U.S. patent application No. 2007/0196846, published on August 23, 2007.
Suitable archaeal DNA polymerases include, but are not limited to, thermostable and/or thermophilic DNA polymerases, such as: Thermus aquaticus (Taq) DNA polymerase, Thermus filiformis (Tfi) DNA polymerase, Thermococcus zilligi (Tzi) DNA polymerase, Thermus thermophilus (Tth) DNA polymerase, Thermus flavus (Tfl) DNA polymerase, Pyrococcus woesei (Pwo) DNA polymerase, Pyrococcus furiosus (Pfu) DNA polymerase and Pfu Turbo DNA polymerase, Thermococcus litoralis (Tli) DNA polymerase or Vent DNA polymerase, Pyrococcus species GB-D polymerase ("Deep Vent" DNA polymerase, New England Biolabs), Thermococcus species JDF-3 (JDF-3) DNA polymerase, Thermococcus gorgonarius (Tgo) DNA polymerase, Thermococcus acidophilum DNA polymerase; Sulfolobus acidocaldarius DNA polymerase; Thermococcus species 9°N-7 DNA polymerase; Thermococcus species NA1 DNA polymerase; Pyrodictium occultum DNA polymerase; Methanococcus voltae DNA polymerase; Methanobacterium thermoautotrophicum DNA polymerase; Methanococcus jannaschii DNA polymerase; Desulfurococcus strain TOK DNA polymerase (D. Tok Pol); Pyrococcus abyssi DNA polymerase; Pyrococcus horikoshii DNA polymerase; Pyrococcus islandicum DNA polymerase; Thermococcus fumicolans DNA polymerase; Aeropyrum pernix DNA polymerase; heterodimeric DNA polymerase DP1/DP2, and the like.
In some embodiments, the modified polymerase is an RNA polymerase. Suitable RNA polymerases include, but are not limited to, T3, T5, T7, and SP6 RNA polymerases.
In some embodiments, the polymerase is a reverse transcriptase (RT). Suitable reverse transcriptases include, but are not limited to, reverse transcriptases from HIV, HTLV-I, HTLV-II, FeLV, FIV, SIV, AMV, MMTV and MoMuLV, as well as commercially available "SuperScript" reverse transcriptases (Life Technologies, Calif.) and telomerase.
In some embodiments, the modified polymerase is derived from a known DNA polymerase. DNA polymerases have been classified into seven different families based on both amino acid sequence comparison and three-dimensional structure analysis. The DNA polymerase I (pol I) or A-type polymerase family includes the repair-type polymerases E. coli DNA pol I, Thermus aquaticus pol I, and Bacillus stearothermophilus pol I, replicative DNA polymerases from several bacteriophages (T3, T5, and T7), and eukaryotic mitochondrial DNA polymerases. The DNA polymerase α (pol α) or B-type polymerase family includes all eukaryotic replicating DNA polymerases as well as archaeal DNA polymerases, viral DNA polymerases, DNA polymerases encoded in mitochondrial plasmids of various fungi and plants, and polymerases from bacteriophages T4 and RB69. Family C polymerases are the primary bacterial chromosomal replication-type enzymes. These are sometimes considered to be a subset of family X, which contains the eukaryotic polymerase pol β, as well as other eukaryotic polymerases such as pol σ, pol λ, pol μ, and terminal deoxynucleotidyl transferase (TdT). Family D polymerases are all found in the Euryarchaeota subdomain of Archaea and are considered replicative polymerases. Family Y polymerases are called translesion synthesis (TLS) polymerases due to their ability to replicate through damaged DNA. They are also called error-prone polymerases because they have lower fidelity on undamaged templates. This family includes Pol η, Pol ζ, Pol ι (iota), Pol κ (kappa) and Rev1, as well as Pol IV and Pol V from E. coli. Finally, the reverse transcriptase family includes reverse transcriptases from retroviruses and eukaryotic polymerases, typically limited to telomerase. These polymerases use an RNA template to synthesize a DNA strand, and are also referred to as RNA-dependent DNA polymerases.
In some embodiments, the modified polymerase or biologically active fragment thereof can be prepared using any suitable method or assay known to those of skill in the art. In some embodiments, any suitable method of performing protein engineering to obtain a modified polymerase or biologically active fragment thereof is encompassed within the scope of the present invention. For example, site-directed mutagenesis is a technique that can be used to introduce one or more known or random mutations into a DNA construct. The introduction of one or more amino acid mutations can be verified, for example, relative to a standard or reference polymerase or via nucleic acid sequencing. After validation, constructs containing one or more of the amino acid mutations can be transformed into bacterial cells and expressed.
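As a purely illustrative sketch of the design and verification steps described above, the following Python code applies substitutions written in the conventional notation (original residue, 1-based position, replacement residue, e.g., "L763F") to an amino acid sequence, and then lists the differences between a candidate sequence and the reference. The helper names and the short toy sequence are hypothetical; the toy sequence is not any sequence of this disclosure, and the sketch assumes the candidate and reference have equal length (substitutions only).

```python
import re

# Original residue, 1-based position, replacement residue (e.g. "L763F").
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence, substitutions):
    """Return a copy of `sequence` carrying the listed substitutions."""
    residues = list(sequence)
    for sub in substitutions:
        match = SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        original, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1  # convert to 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"position out of range in {sub!r}")
        if residues[index] != original:
            raise ValueError(
                f"{sub!r} expects {original} but found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


def list_substitutions(reference, candidate):
    """List the substitutions that convert `reference` into `candidate`."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must have equal length")
    return [
        f"{ref}{pos}{cand}"
        for pos, (ref, cand) in enumerate(zip(reference, candidate), start=1)
        if ref != cand
    ]


# Toy reference sequence, for illustration only.
toy_reference = "MKLEAL"
toy_variant = apply_substitutions(toy_reference, ["L3F", "E4V"])
print(toy_variant, list_substitutions(toy_reference, toy_variant))
```

In practice the candidate sequence would be the translation of the sequenced construct, so that unintended mutations show up as extra entries in the returned list.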
Typically, colonies containing the mutant expression constructs are inoculated into culture media, induced and grown to the desired optical density, followed by collection (usually via centrifugation) and purification of the supernatant. It will be immediately apparent to the skilled artisan that the supernatant may be purified by any suitable means. Typically, a column is selected for analytical or preparative protein purification. In some embodiments, the modified polymerase or biologically active fragment thereof prepared using the methods can be purified on (but is not limited to) a heparin column, essentially according to the manufacturer's instructions.
After purification, the modified polymerase or biologically active fragment thereof can be assessed for various polymerase activities, properties, or characteristics using any suitable method. In some embodiments, the polymerase activity, characteristic, or feature assessed will depend on the relevant application. For example, a property of a polymerase used for amplification or sequencing of a nucleic acid molecule of about 300 to about 600bp in length, such as increased processivity and/or increased read length, can be analyzed relative to a reference polymerase lacking one or more amino acid modifications (e.g., substitutions, deletions, or additions). In another example, applications requiring deep targeted re-sequencing of nucleic acid molecules of about 100bp in length may include polymerase properties such as increased primary accuracy, increased total sequencing throughput, reduced strand bias, or reduced systematic error. In some embodiments, the one or more polymerase properties assessed may be correlated with polymerase performance or polymerase activity in the presence of a high ionic strength solution (e.g., at least 120mM salt).
In some embodiments, a modified polymerase or biologically active fragment thereof prepared according to the methods disclosed herein can be assessed for DNA binding activity, nucleotide polymerization activity, primer extension activity, strand displacement activity, reverse transcriptase activity, 3'-5' exonuclease (proofreading) activity, and the like.
In some embodiments, a modified polymerase or biologically active fragment thereof prepared according to the methods can be assessed for increased accuracy, increased processivity, increased mean read length, increased minimum read length, increased total sequencing throughput, decreased strand bias, decreased systematic error, increased AQ20, increased 200Q17 value, or the ability to perform nucleotide polymerization, as compared to a reference polymerase under identical conditions. In some embodiments, the modified polymerase or biologically active fragment thereof can be assessed for any polymerase activity in the presence of a high ionic strength solution (e.g., a salt solution having at least 120mM salt, such as NaCl and/or KCl).
In some embodiments, the modified polymerase or biologically active fragment thereof is optionally characterized by a change (e.g., an increase or decrease) in any one or more of the following properties (often relative to a polymerase lacking the corresponding one or more amino acid mutations): the dissociation time constant, the rate of dissociation of the polymerase from a given nucleic acid template, the binding affinity of the polymerase to the given nucleic acid template, and characteristics associated with the nucleic acid sequencing reaction, such as average read length, minimum read length, accuracy, total number of perfect reads, total sequencing throughput, strand bias, systematic error, fold increase in sequencing reaction throughput, salt performance (i.e., ionic strength), AQ20, average error-free read length, error rate, 100Q17 value, 200Q17 value, Q score, raw read accuracy, and processivity. It is to be understood that in illustrative embodiments of the invention, the modified polymerase is used in an emulsion PCR reaction to amplify the template as part of a sequencing workflow, for example, to amplify the template on a solid support, and in some illustrative embodiments, to clonally amplify the template on the solid support. Methods for making emulsions and performing emulsion PCR are known in the art. Compounds for making emulsions (e.g., biocompatible oils and emulsion stabilizers) are commercially available (e.g., Sigma, St. Louis, MO; Uniqema, New Jersey). The nucleic acid sequence of at least a portion of the amplified template is then determined. The results of this sequencing are compared to the results of a similar experiment in which a reference polymerase, such as Taq polymerase (SEQ ID NO:1) or the modified Taq polymerase of SEQ ID NO:34, is used for the emulsion PCR template amplification step. The examples provided herein demonstrate specific instances of such comparative testing.
Nucleic acid libraries can be amplified onto Ion Sphere™ particles (Ion Torrent Systems, part number 602-1075-01) substantially according to the protocol provided in the user guide for the Ion Xpress™ Template Kit v2.0 (Ion Torrent Systems, part number 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, part number 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, part number 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, part number 4466463), except that the tested polymerase or the reference polymerase can be used instead of the polymerase provided in the kit, and the results obtained with the tested polymerase can be compared to those produced with the reference polymerase. The amplified nucleic acid molecules are then loaded into a PGM™ 314 sequencing chip. The chip is loaded into an Ion Torrent PGM™ sequencing system (Ion Torrent Systems/Life Technologies, part number 4462917), and sequencing is performed substantially according to the protocol provided in the Ion Sequencing Kit v2.0 user guide (Ion Torrent Systems/Life Technologies, part number 4469714 Rev A), using reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, part number 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, part number 4462923).
In some embodiments, the modified polymerase or biologically active fragment thereof can be assessed individually relative to known values in the art for similar polymerases. In some embodiments, a modified polymerase or biologically active fragment thereof prepared according to the methods disclosed herein can be assessed under similar or identical conditions relative to a known or reference polymerase. In some embodiments, the conditions may include amplification or sequencing of nucleic acid molecules in the presence of a high ionic strength solution.
In some embodiments, the present invention generally relates to methods for producing a plurality of modified polymerases or biologically active fragments. In some embodiments, the present invention generally relates to methods of producing a plurality of modified polymerases or biologically active fragments using high throughput or automated systems. In some embodiments, the method comprises mixing a plurality of modified polymerases or biologically active fragments with a series of reagents required for protein purification and extracting the purified polymerases or biologically active fragments from the mixture. In one example, multiple random or site-directed mutagenesis reactions can be prepared in 96-well or 384-well culture plates. Optionally, the contents of a 96-well or 384-well culture plate can be subjected to an initial screening to identify polymerase mutant constructs. The contents of each individual well (or the contents of each well from the initial screening) can be delivered to a series of flasks, tubes, or shakers for inoculation and induction. At the desired optical density, the contents of the flask, tube or shaker may be centrifuged and the supernatant recovered. Each supernatant can be subjected to protein purification, for example via a fully automated column (see, e.g., Camper and Viola, Analytical Biochemistry, 2009, pages 176-181). The purified modified polymerase or biologically active fragment can be assessed for one or a combination of polymerase activities, such as DNA binding, primer extension, strand displacement, reverse transcriptase activity, and the like. It is envisaged that the skilled person may use the method (or variations of the method within the scope of the invention) to identify a plurality of modified polymerases or biologically active fragments.
In some aspects, the methods can be used to identify a plurality of modified polymerases or biologically active fragments with enhanced accuracy compared to a reference polymerase under the same conditions. In some embodiments, the methods can be used to identify a plurality of modified polymerases or biologically active fragments thereof with enhanced accuracy in the presence of a high ionic strength solution. In some aspects, the methods can be used to identify a plurality of modified polymerases or biologically active fragments having enhanced read lengths compared to a reference polymerase under the same conditions. In some embodiments, the methods can be used to identify a plurality of modified polymerases or biologically active fragments thereof with enhanced read lengths in the presence of a high ionic strength solution. In some aspects, the methods can be used to identify a plurality of modified polymerases or biologically active fragments having enhanced thermostability as compared to a reference polymerase under the same conditions. In some embodiments, the methods can be used to identify a plurality of modified polymerases or biologically active fragments thereof with enhanced thermostability in the presence of a high ionic strength solution. In some aspects, the methods can be used to identify a plurality of modified polymerases or biologically active fragments having reduced strand bias and/or reduced systematic error as compared to a reference polymerase under the same conditions. In some embodiments, the methods can be used to identify a plurality of modified polymerases or biologically active fragments thereof having reduced strand bias and/or reduced systematic error in the presence of high ionic strength solutions. In some embodiments, the high ionic strength solution may include KCl and/or NaCl salts. In some embodiments, the high ionic strength solution may be at least 120mM salt. 
In some embodiments, the high ionic strength solution may be 125mM to 200mM salt. In some embodiments, the high ionic strength solution may be at a salt concentration of about 130mM, 150mM, 200mM, 225mM, 250mM, 275mM, 300mM, 350mM, 400mM, 450mM, 500mM, or greater. In some embodiments, the high ionic strength solution may be about 125mM to about 400mM salt. In some embodiments, the high ionic strength solution may be about 150mM to about 275mM salt. In some embodiments, the high ionic strength solution may be about 200mM to about 250mM salt. It will be apparent to the skilled person that various other suitable salts may be used instead of or in combination with KCl and/or NaCl. In some embodiments, the high ionic strength solution may further comprise a sulfate.
As will be immediately apparent to the skilled artisan, the present invention outlines exemplary automated and high throughput methods to generate libraries of modified polymerases or biologically active fragments. The invention also outlines methods to assess the polymerase activity of such modified polymerases or biologically active fragments. It is also contemplated that the skilled artisan can readily generate a mutagenized library of constructs in which each amino acid within the relevant polymerase can be mutated. In some embodiments, a mutagenized library can be prepared in which every amino acid residue within the polymerase is mutated to every possible alternative amino acid. In some embodiments, a mutagenized library can be prepared in which every amino acid within the polymerase is mutated, and in which the possible amino acid mutations are limited to conservative or non-conservative amino acid substitutions. In both examples, a mutagenized library can be generated that contains a large number of mutant constructs that can be applied via automated or high-throughput systems for purification or for initial screening. In some embodiments, 96- or 384-well plates representing constructs of the mutagenized library can be assessed by ISFET-based sequencing polymerase screening using next generation (i.e., high throughput) platforms such as the Ion Torrent Systems Personal Genome Machine and ion-based ISFET sequencing chips (Life Technologies, Inc., California). In one example, the polymerase screen may include one or more 96- or 384-well culture plates representing the mutagenized library, wherein each well of the culture plate contains a different construct (modified polymerase) containing at least one or more amino acid mutations compared to a reference polymerase (lacking the at least one or more amino acid mutations) present in at least one well on the same culture plate.
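The library design described above, in which every residue is mutated to every alternative amino acid, can be sketched in a few lines of Python. This is an illustrative enumeration only, under the assumption that the library consists of single substitutions among the 20 standard amino acids; the function name and the three-residue toy sequence are hypothetical.

```python
# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitution_library(sequence):
    """Map each single substitution (e.g. 'L3F') to its variant sequence."""
    variants = {}
    for position, original in enumerate(sequence, start=1):
        for replacement in AMINO_ACIDS:
            if replacement == original:
                continue  # skip the unmodified residue
            name = f"{original}{position}{replacement}"
            variants[name] = (
                sequence[: position - 1] + replacement + sequence[position:]
            )
    return variants


# Toy sequence, for illustration only: 3 residues x 19 alternatives each.
library = single_substitution_library("MKL")
print(len(library), library["L3F"])
```

A sequence of N standard residues yields 19 × N single-substitution constructs under this assumption, which indicates how many 96- or 384-well plates a full screen would occupy.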
In some embodiments, the reference polymerase serves as a control sample within a 96- or 384-well culture plate to assess the polymerase activity of each modified polymerase within each well of the same culture plate. In some embodiments, the library of constructs and the reference polymerase within the culture plate may further comprise a unique barcode for each modified polymerase within the culture plate. Thus, a 96-well plate can contain 96 barcodes if each well in the plate contains a reference polymerase or modified polymerase construct. After purification, the proteins of the mutagenized library can be assessed for one or a combination of polymerase activities or properties, such as DNA binding, primer extension, strand displacement, reverse transcriptase activity, nick-initiated polymerase activity, primary accuracy, increased total sequencing throughput, reduced strand bias, reduced systematic error, increased read length, increased thermostability, increased processivity, and the like. In some embodiments, the template library may further comprise a library of templates known to perform well under the proposed amplification conditions, such that the well-performing template library may serve as a baseline or control reading.
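The plate arrangement described above, with a reference polymerase control and one unique barcode per well, might be laid out as in the following Python sketch. The well-naming scheme (rows A to H, columns 1 to 12), the barcode labels, the placement of the reference in the first well, and the construct names are all assumptions made for illustration and are not prescribed by this disclosure.

```python
def plate_wells(rows="ABCDEFGH", columns=12):
    """Well names of a plate in row-major order (default: 96-well plate)."""
    return [f"{row}{col}" for row in rows for col in range(1, columns + 1)]


def layout_plate(constructs, reference_name="reference", barcode_prefix="BC"):
    """Assign the reference control and each construct to a well and barcode."""
    wells = plate_wells()
    samples = [reference_name] + list(constructs)
    if len(samples) > len(wells):
        raise ValueError("more samples than wells on the plate")
    return [
        {"well": well, "sample": sample, "barcode": f"{barcode_prefix}{i:03d}"}
        for i, (well, sample) in enumerate(zip(wells, samples), start=1)
    ]


# 95 hypothetical constructs plus one reference control fill a 96-well plate.
constructs = [f"variant_{n}" for n in range(1, 96)]
layout = layout_plate(constructs)
print(layout[0], layout[-1])
```

Because every well receives its own barcode, reads from pooled wells can be assigned back to the construct or control that produced them.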
Optionally, the purified modified polymerase or biologically active fragment thereof can be further assessed for other properties, such as the ability to amplify or sequence nucleic acid molecules in the presence of high salt. The source or origin of the polymerase to be mutated is not generally considered critical. For example, eukaryotic, prokaryotic, archaeal, bacterial, bacteriophage or viral polymerases may be used in the methods. In some embodiments, the polymerase can be a DNA or RNA polymerase. In some embodiments, the DNA polymerase can include a family A or a family B polymerase. In some embodiments, the DNA polymerase can include a thermostable DNA polymerase. In view of the fields of protein engineering and enzymology, the exemplary methods provided herein should be viewed as illustrative, and should not be construed as limiting in any way.
In some embodiments, the modified polymerase or biologically active fragment thereof includes one or more amino acid mutations located within the catalytic domain of the modified polymerase. In some embodiments, the modified polymerase or biologically active fragment thereof can include at least 25, 50, 75, 100, 150, or more amino acid residues of the catalytic domain. In some embodiments, the modified polymerase or biologically active fragment thereof can include any portion of the catalytic domain that comprises at least 25, 50, 75, 100, 150, or more contiguous amino acid residues. In some embodiments, the modified polymerase or biologically active fragment thereof may include at least 25 contiguous amino acid residues of the catalytic domain and may optionally include one or more amino acid residues at the C-terminus or N-terminus outside of the catalytic domain. In some embodiments, the modified polymerase or biologically active fragment can include any 25, 50, 75, 100, 150 or more contiguous amino acid residues in the catalytic domain coupled to any one or more non-catalytic domain amino acid residues.
In some embodiments, the modified polymerase (or biologically active fragment thereof) comprises one or more amino acid mutations located within the catalytic domain of the modified polymerase, and wherein the polymerase has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more identity to any of the following sequences: SEQ ID NO: 2. SEQ ID NO: 3. SEQ ID NO: 4. SEQ ID NO: 5. SEQ ID NO: 6. SEQ ID NO: 7. SEQ ID NO: 8. SEQ ID NO: 9. SEQ ID NO: 10. SEQ ID NO: 11. SEQ ID NO: 12. SEQ ID NO: 13. SEQ ID NO: 14. SEQ ID NO: 15. SEQ ID NO: 16. SEQ ID NO: 17. SEQ ID NO: 18. SEQ ID NO: 19. SEQ ID NO: 20. SEQ ID NO: 21. SEQ ID NO: 22. SEQ ID NO: 23. SEQ ID NO: 24. SEQ ID NO: 25. SEQ ID NO: 26. SEQ ID NO: 27. SEQ ID NO: 28. SEQ ID NO: 29. SEQ ID NO: 30. SEQ ID NO: 31. SEQ ID NO:32 and SEQ ID NO: 33.
in some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 or 50 contiguous amino acid residues of the catalytic domain and has at least 80% identity to any one of the following sequences: SEQ ID NO: 2. SEQ ID NO: 3. SEQ ID NO: 4. SEQ ID NO: 5. SEQ ID NO: 6. SEQ ID NO: 7. SEQ ID NO: 8. SEQ ID NO: 9. SEQ ID NO: 10. SEQ ID NO: 11. SEQ ID NO: 12. SEQ ID NO: 13. SEQ ID NO: 14. SEQ ID NO: 15. SEQ ID NO: 16. SEQ ID NO: 17. SEQ ID NO: 18. SEQ ID NO: 19. SEQ ID NO: 20. SEQ ID NO: 21. SEQ ID NO: 22. SEQ ID NO: 23. SEQ ID NO: 24. SEQ ID NO: 25. SEQ ID NO: 26. SEQ ID NO: 27. SEQ ID NO: 28. SEQ ID NO: 29. SEQ ID NO: 30. SEQ ID NO: 31. SEQ ID NO:32 and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 75 contiguous amino acid residues of the catalytic domain and has at least 85% identity to any one of the following sequences: SEQ ID NO: 2. SEQ ID NO: 3. SEQ ID NO: 4. SEQ ID NO: 5. SEQ ID NO: 6. SEQ ID NO: 7. SEQ ID NO: 8. SEQ ID NO: 9. SEQ ID NO: 10. SEQ ID NO: 11. SEQ ID NO: 12. SEQ ID NO: 13. SEQ ID NO: 14. SEQ ID NO: 15. SEQ ID NO: 16. SEQ ID NO: 17. SEQ ID NO: 18. SEQ ID NO: 19. SEQ ID NO: 20. SEQ ID NO: 21. SEQ ID NO: 22. SEQ ID NO: 23. SEQ ID NO: 24. SEQ ID NO: 25. SEQ ID NO: 26. SEQ ID NO: 27. SEQ ID NO: 28. SEQ ID NO: 29. SEQ ID NO: 30. SEQ ID NO: 31. SEQ ID NO:32 and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 or 50 contiguous amino acid residues of the catalytic domain and has at least 90% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the catalytic domain and has at least 95% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 or 50 contiguous amino acid residues of the catalytic domain and has at least 98% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 or 50 contiguous amino acid residues of the catalytic domain and has at least 99% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes one or more amino acid mutations located within the DNA binding domain of the polymerase. In some embodiments, the modified polymerase or biologically active fragment thereof can include at least 25, 50, 75, 100, 150, or more amino acid residues of the DNA binding domain of the modified polymerase. In some embodiments, the modified polymerase or biologically active fragment thereof can include any portion of the DNA binding domain that comprises at least 25, 50, 75, 100, 150, or more contiguous amino acid residues. In some embodiments, the modified polymerase or biologically active fragment thereof may include at least 25 contiguous amino acid residues of the binding domain and may optionally include one or more amino acid residues at the C-terminus or N-terminus outside the binding domain. In some embodiments, the modified polymerase or biologically active fragment can include any 25, 50, 75, 100, 150 or more contiguous amino acid residues in the binding domain coupled to any one or more non-binding domain amino acid residues. In some embodiments, the modified polymerase (or biologically active fragment thereof) comprises one or more amino acid mutations located within the DNA binding domain of the modified polymerase, and the polymerase has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 80% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 85% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 90% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 95% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 98% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33. In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 50 contiguous amino acid residues of the DNA binding domain and has at least 80%, 85%, 90%, 95%, 98%, or 99% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes one or more amino acid mutations located outside of the catalytic domain of the polymerase (also referred to herein as the DNA binding cleft). The catalytic domains of family A DNA polymerases, family B DNA polymerases, and reverse transcriptases, as well as RNA-dependent RNA polymerases, are well known; all share a common overall structure and catalytic mechanism. The catalytic domains of all of these polymerases have a shape that has been compared to a right hand and consist of "palm", "thumb" and "finger" domains. The palm domain typically contains the catalytic site for the phosphoryl transfer reaction. The thumb is believed to play a role in positioning the duplex DNA and in processivity and translocation. The fingers interact with the incoming nucleotide and the template base with which it is paired. The palm domains are homologous in the A, B and RT families, but the arrangements of the fingers and thumb are different. The thumb domains of the different polymerase families share common features, containing parallel or antiparallel alpha-helices, where at least one alpha-helix interacts with the minor groove of the primer-template complex. The finger domain also conserves an alpha-helix positioned at the blunt end of the primer-template complex. This helix contains highly conserved side chains (the B motif).
Three conserved motifs, A, B and C, have been identified for the A family polymerases. The A and C motifs are typically conserved in both B family polymerases and RT polymerases (Delarue et al., Protein Engineering 3:461-467 (1990)).
In some embodiments, for A family polymerases, the A motif comprises the consensus sequence:
a.
In some embodiments, for A family polymerases, the B motif comprises the consensus sequence:
a.
In some embodiments, for A family polymerases, the C motif comprises the consensus sequence:
a.
In some embodiments, the polymerase optionally comprises any family A polymerase or biologically active fragment, mutant, variant or truncation thereof, wherein the linking moiety is attached to any amino acid residue in the family A polymerase or biologically active fragment, mutant, variant or truncation thereof that is outside of the A, B or C motif. In some embodiments, the linking moiety is linked to any amino acid residue of the family A polymerase or the biologically active fragment that is outside of the A motif, B motif, or C motif.
The A and C motifs typically form part of the palm domain, and each motif typically contains a strictly conserved aspartate residue, which is involved in the catalytic mechanism common to all DNA polymerases. DNA synthesis can be carried out by transferring a phosphoryl group from the incoming nucleotide to the 3' OH of the DNA, releasing the polyphosphate moiety and forming a new DNA phosphodiester bond. This reaction is typically catalyzed by a mechanism involving two metal ions (usually Mg²⁺) and the two conserved aspartate residues.
In some embodiments, the conserved glutamic acid residue in motif A of the A family DNA polymerases plays an important role in correct nucleotide incorporation, as does the corresponding conserved tyrosine in the B family members (Minnick et al., Proc. Natl. Acad. Sci. USA 99:1194-1199 (2002); Parsell et al., Nucleic Acids Res. 35:3076-3086 (2002)). Mutation at the conserved Leu of motif A affects replication fidelity (Venkatesan et al., J. Biol. Chem. 281:4486-4494 (2006)).
In some embodiments, the B motif contains conserved lysine, tyrosine, and glycine residues. The B motif of E. coli Pol I has been shown to bind the nucleotide substrate and contains a conserved tyrosine that has been shown to lie in the active site.
In some embodiments, for a B family polymerase, the A motif comprises the consensus sequence:
In some embodiments, for a B family polymerase, the B motif comprises the consensus sequence:
In some embodiments, for a B family polymerase, the C motif comprises the consensus sequence:
a.
Residues in bold indicate invariant residues.
In some embodiments, the modified polymerase optionally comprises any B family polymerase or biologically active fragment, mutant, variant or truncation thereof, wherein the linking moiety is attached to any amino acid residue in the B family polymerase or biologically active fragment, mutant, variant or truncation thereof that is outside of the A, B or C motif. In some embodiments, the linking moiety is linked to any amino acid residue of the family B polymerase or the biologically active fragment that is outside of the A motif, B motif, or C motif.
In some embodiments, the B family polymerase contains six conserved motifs, where regions I and II correspond to the A and C motifs of the A family. Region III is involved in nucleotide binding and is functionally homologous to motif B. Regions I, II and III converge at the center of the active site from the palm (I), the fingers (II) and the base of the thumb (III) to create a continuous conserved surface. Within these regions, a group of highly conserved residues forms three chemically distinct clusters consisting of exposed aromatic residues, negatively charged residues, and positively charged residues, respectively. For example, in the replicative polymerase of bacteriophage RB69, the three clusters correspond to the following amino acid residues: Y416, Y567 and Y391 (exposed aromatic residues), D621, D623, D411, D684 and E686 (negatively charged residues), and K560, R482 and K486 (positively charged residues). See Wang et al., Cell 89:1087-. These three clusters typically encompass the region where the primer terminus and the incoming nucleotide are expected to bind. In some embodiments, the modified polymerase optionally comprises any B family polymerase or biologically active fragment, mutant, variant or truncation thereof, wherein the linking moiety is attached to any amino acid residue in the B family polymerase or biologically active fragment, mutant, variant or truncation thereof that is outside of one or more of these conserved amino acid clusters or motifs. In some embodiments, the linking moiety is linked to any amino acid residue in the family B polymerase or biologically active fragment, mutant, variant, or truncation thereof that is outside of any of these conserved amino acid clusters or motifs.
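As an illustration of the "outside of these conserved amino acid clusters" criterion, the following minimal sketch encodes the RB69 cluster residues listed above and filters a list of candidate positions. The function names and the candidate positions are hypothetical, and the check covers only the listed cluster residues, not the full conserved regions or motifs.

```python
# RB69 cluster residues as listed in the text (position -> cluster name).
RB69_CLUSTERS = {
    "exposed aromatic": {416: "Y", 567: "Y", 391: "Y"},
    "negatively charged": {621: "D", 623: "D", 411: "D", 684: "D", 686: "E"},
    "positively charged": {560: "K", 482: "R", 486: "K"},
}


def conserved_cluster(position):
    """Return the name of the cluster containing `position`, or None."""
    for name, residues in RB69_CLUSTERS.items():
        if position in residues:
            return name
    return None


def positions_outside_clusters(positions):
    """Keep only the positions that are not part of any listed cluster."""
    return [p for p in positions if conserved_cluster(p) is None]


# Hypothetical candidate attachment positions, for illustration only.
candidates = [100, 411, 560, 700]
print(positions_outside_clusters(candidates))
```

A real selection of attachment sites would also need to exclude the surrounding conserved regions and take structural accessibility into account, which this sketch does not attempt.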
RT polymerases contain four conserved sequence motifs (Poch et al., EMBO J. 12:3867-3874 (1989)), where motifs A and C contain conserved catalytic aspartates. The integrity of motif B is also required for reverse transcriptase function.
The consensus sequence for motif A is DXXXXXF/Y (SEQ ID NO:41).
The consensus sequence for motif B is FXGXXXS/A (SEQ ID NO:42).
The consensus sequence for motif C is YXDD.
The consensus sequence for motif D is GXXXXXXK (SEQ ID NO:44).
Mutations in the YXDD motif (motif C), which is the most highly conserved of these motifs, can abolish polymerase activity and alter processivity and fidelity (Sharma et al., Antiviral Chemistry and Chemotherapy 16:169-182 (2005)). In addition, the conserved lysine residue in motif D, a loop unique to RT polymerases, is an invariant residue critical to nucleotide binding (Canard et al., J. Biol. Chem. 274:35768-35776 (1999)).
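The RT consensus motifs can be located in a candidate sequence with simple pattern matching. The sketch below is illustrative only: it assumes that X denotes any residue and that F/Y and S/A denote alternatives at a single position, and it uses a made-up toy sequence rather than a real reverse transcriptase. The names `RT_MOTIFS` and `find_motifs` are hypothetical.

```python
import re

# Regular expressions for the consensus motifs given in the text.
# Assumption: X = any residue; "F/Y" and "S/A" = either residue.
RT_MOTIFS = {
    "A": "D.{5}[FY]",    # DXXXXXF/Y
    "B": "F.G.{3}[SA]",  # FXGXXXS/A
    "C": "Y.DD",         # YXDD
    "D": "G.{6}K",       # GXXXXXXK
}


def find_motifs(sequence, motifs=RT_MOTIFS):
    """Return {motif name: [(1-based start, matched text), ...]}."""
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead is used so that overlapping matches are reported.
        regex = re.compile("(?=(" + pattern + "))")
        hits[name] = [(m.start() + 1, m.group(1))
                      for m in regex.finditer(sequence)]
    return hits


# Toy sequence assembled from one instance of each motif plus filler.
toy = "MK" + "DAAAAAF" + "PP" + "FAGTTTS" + "LL" + "YMDD" + "QQ" + "GAAAAAAK" + "E"

for name, found in find_motifs(toy).items():
    print(name, found)
```

Positions returned by such a scan could then be excluded when choosing residues "outside of" the A, B, C and D motifs; short patterns like YXDD will also match by chance in unrelated sequences, so hits would need to be confirmed against an alignment.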
In some embodiments, the modified polymerase optionally comprises any RT polymerase or biologically active fragment, mutant, variant or truncation thereof, wherein the linking moiety is attached to any amino acid residue of the RT polymerase or biologically active fragment, mutant, variant or truncation thereof that is outside of one or more of the A, B, C and D motifs. In some embodiments, the linking moiety is attached to any amino acid residue in the RT polymerase or biologically active fragment, mutant, variant or truncation thereof that is outside of any of these motifs.
In some embodiments, the modified polymerase includes one or more modifications (including amino acid substitutions, deletions, additions, or chemical modifications) at any position other than a conserved or invariant residue.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25, 50, 75, or 100 contiguous amino acid residues having at least 80% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 50, 75, 100, 150, 175, or 200 contiguous amino acid residues having at least 85% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 225, 250, 275, 300, 325, 350, 375, or 400 contiguous amino acid residues having at least 85% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500 or more contiguous amino acid residues having at least 90% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 100, 200, 300, 400, 500, 600, 700 or more contiguous amino acid residues having at least 95% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 100 contiguous amino acid residues having at least 98% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 150 contiguous amino acid residues having at least 99% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 200 contiguous amino acid residues having at least 99% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 400 contiguous amino acid residues having at least 99% identity to any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33.
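A criterion of the form "at least N contiguous amino acid residues having at least P% identity" can be evaluated with a sliding window over a pairwise comparison. The following is a minimal sketch under strong simplifying assumptions: the two sequences are already aligned position by position with no gaps, and identity is counted as exact residue matches divided by window length. In practice a sequence alignment tool would be used first. The function names and toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def has_window_with_identity(query, reference, window, min_identity):
    """True if some ungapped window of `window` contiguous residues,
    taken at the same offset in both sequences, reaches `min_identity` %."""
    n = min(len(query), len(reference))
    if window < 1 or window > n:
        return False
    match = [q == r for q, r in zip(query, reference)]
    count = sum(match[:window])  # matches in the first window
    # Compare count/window >= min_identity/100 without division.
    if 100 * count >= min_identity * window:
        return True
    for i in range(window, n):
        # Slide the window by one residue and update the running count.
        count += match[i] - match[i - window]
        if 100 * count >= min_identity * window:
            return True
    return False


# Toy 20-residue sequences differing at one position (L10 -> A).
reference = "ACDEFGHIKLMNPQRSTVWY"
query = "ACDEFGHIKAMNPQRSTVWY"

print(percent_identity(query, reference))
print(has_window_with_identity(query, reference, window=10, min_identity=100))
```

The running-count update keeps the scan linear in sequence length, which matters little for a single polymerase but helps when many variants are screened.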
In some embodiments, the modified polymerase may include one or more additional functional domains in addition to the polymerase domain, including domains required for 3'->5' (reverse) exonuclease activity that mediates proofreading of the newly synthesized DNA strand, for 5'->3' (forward) exonuclease activity that mediates nick translation during DNA repair, or for FLAP endonuclease activity. In some embodiments, the modified polymerase has strand displacement activity and can catalyze nucleic acid synthesis by polymerizing nucleotides onto the 3' end of a nick within a double-stranded nucleic acid template while displacing the nucleic acid located downstream of the nick. It will be appreciated by those skilled in the art that modified polymerases as encompassed by the present invention optionally also have any one or more of these activities.
The 3' to 5' exonuclease proofreading domains of both family A and family B DNA polymerases contain three conserved motifs, called Exo I, Exo II and Exo III, each of which contains an invariant aspartate residue that is essential for metal binding and exonuclease function. Alterations of these conserved aspartate residues produce proteins that retain polymerase activity but are deficient in exonuclease activity (Hall et al., J. Gen. Virol. 76:2999-3008 (1995)). Conserved motifs and amino acid changes in the 5' to 3' exonuclease domain that affect exonuclease activity have also been identified (U.S. Patent No. 5,466,591).
Representative examples of family A enzymes are E.coli Pol I or Klenow fragment of E.coli Pol I, Bst DNA polymerase, Taq DNA polymerase, T7 DNA polymerase and Tth DNA polymerase. Family A enzymes also include the Platinum Taq DNA polymerase family.
In some embodiments, the A family enzymes are characterized by high DNA elongation rates but may have poor fidelity due to a lack of 3'-5' exonuclease activity. In some embodiments, the B family enzymes may have high fidelity due to their 3'-5' exonuclease activity, but may achieve low DNA elongation rates.
Other types of polymerases include, e.g., Tbr polymerase, Tfl polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tih polymerase, Tfi polymerase, and the like. RT polymerases include HIV reverse transcriptase, Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, and Rous Sarcoma Virus (RSV) reverse transcriptase. Variants, modified products and derivatives thereof may also be used. Similarly, Taq, Platinum Taq, Tth, Tli, Pfu, PfuTurbo, Pyrobest, Pwo, KOD, VENT, DEEPVENT, EX-Taq, LA-Taq, Therminator™, the Extension series and Platinum Taq Hi-Fi are all commercially available. Other enzymes can be readily isolated from a particular bacterium by one of ordinary skill in the art.
An exemplary polymerase, E. coli DNA polymerase I ("Pol I"), has three enzymatic activities: 5' to 3' DNA polymerase activity; 3' to 5' exonuclease activity that mediates proofreading; and 5' to 3' exonuclease activity that mediates nick translation during DNA repair. The Klenow fragment is a large protein fragment produced when E. coli Pol I is proteolytically cleaved by subtilisin. It retains polymerase and proofreading exonuclease activity but lacks 5' to 3' exonuclease activity. Exo- Klenow fragments, which have been mutated to remove the proofreading exonuclease activity, are also available. The structure of the Klenow fragment shows that highly conserved residues that interact with DNA include N675, N678, K635, R631, E611, T609, R835, D827, S562 and N579 (Beese et al., Science 260:352-.
Arg682 in the Klenow fragment of E. coli DNA polymerase I (Pol I) is important for the template-dependent nucleotide binding function and appears to maintain the high processivity of the DNA polymerase (Pandey et al., European Journal of Biochemistry 214:59-65 (1993)).
In some embodiments, the modified polymerase may be derived from Taq DNA polymerase, which is a family A DNA polymerase derived from the thermophilic bacterium Thermus aquaticus. It is best known for its use in the polymerase chain reaction. Taq polymerase lacks proofreading activity and therefore has relatively low replication fidelity (Kim et al., Nature 376:612-616 (2002)).
In some embodiments, the polymerase may be derived from bacteriophage T7 DNA polymerase, a family A DNA polymerase consisting of a 1:1 complex of the viral T7 gene 5 protein (80 kDa) and E. coli thioredoxin (12 kDa). It lacks a 5'->3' exonuclease domain, but its 3'->5' exonuclease activity is approximately 1000 times greater than that of the E. coli Klenow fragment. The exonuclease activity appears to be responsible for the high fidelity of this enzyme and prevents strand displacement synthesis. This polymerase typically exhibits a high level of processivity.
In some embodiments, the polymerase may be derived from KOD DNA polymerase, which is a family B DNA polymerase derived from Thermococcus kodakaraensis. KOD polymerase is a thermostable DNA polymerase with high fidelity and processivity.
In some embodiments, the polymerase may be derived from Therminator™ DNA polymerase, which is also a family B DNA polymerase. Therminator™ is an A485L point mutant of the DNA polymerase from Thermococcus species 9°N-7 (Ichida et al., Nucleic Acids Res. 33:5214-5222 (2005)). Therminator™ polymerase has an enhanced ability to incorporate modified substrates such as dideoxynucleotides, ribonucleotides, and acyclic nucleotides.
In some embodiments, the polymerase may be derived from a Phi29 polymerase or a Phi29-type polymerase, such as a polymerase derived from bacteriophage B103. Phi29 and B103 DNA polymerases are family B polymerases from related bacteriophages. In addition to the A, B and C motifs, the Phi29 family of DNA polymerases contains an additional conserved motif, KXY, in region Y (Blanco et al., J. Biol. Chem. 268:16763-16770 (1993)). Mutations in Phi29 and B103 polymerases that affect polymerase activity and nucleotide binding affinity are described in U.S. Patent Publication No. 20110014612 and its priority U.S. provisional applications Nos. 61/307,356, 61/299,917, 61/299,919, 61/293,616, 61/293,618, 61/289,388, 61/263,974, 61/245,457, 61/242,771, 61/184,770 and 61/164,324, which are incorporated herein by reference in their entirety.
In some embodiments, the polymerase is derived from the reverse transcriptase of human immunodeficiency virus type 1 (HIV-1), which is a heterodimer consisting of one 66 kDa subunit and one 51 kDa subunit. The p66 subunit contains both the polymerase and RNase H domains; proteolytic cleavage of p66 removes the RNase H domain to generate the p51 subunit (Wang et al., PNAS 91:7242-7246 (1994)). The structure of HIV-1 reverse transcriptase shows a variety of interactions between the 2'-OH groups of the RNA template and the reverse transcriptase. Residues Ser280 and Arg284 of helix I in the thumb of p66 are involved in RNA-RT interactions, as are residues Glu89 and Gln91 of the template clamp in the palm of p66. The p51 subunit also plays a role in the interaction between the RNA-DNA duplex and the RT, where residues Lys395, Glu396, Lys22 and Lys390 of the p51 subunit interact with the DNA:RNA duplex (Kohlstaedt et al., Science 256:1783-1790 (1992) and Sarafianos et al., The EMBO Journal 20:1449-1461 (2001)).
In some embodiments, the polymerase is derived from Bst DNA polymerase from Bacillus stearothermophilus or any biologically active fragment thereof. Bst polymerase can be a family A DNA polymerase. The large fragment of naturally occurring Bst DNA polymerase is equivalent to the Klenow fragment of E. coli Pol I, retaining the polymerase and proofreading exonuclease activity while lacking 5' to 3' exonuclease activity. In some embodiments, polymerases derived from Bst DNA polymerase may lack 3' to 5' exonuclease activity. As used herein, the term "Bst DNA polymerase" can refer to either the full-length protein or the large fragment of Bst.
In some embodiments, the modified polymerase consists of or comprises an isolated polymerase variant having or comprising an amino acid sequence at least 80% identical to the amino acid sequence of wild-type full-length or wild-type large fragment Bst DNA polymerase. In some embodiments, the modified polymerase is an isolated Bst DNA polymerase variant comprising a variant having an amino acid sequence at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the amino acid sequence of a wild-type Bst or large fragment Bst DNA polymerase. In some embodiments, the modified Bst polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions, or chemical modifications) relative to the Bst polymerase (which corresponds to the reference polymerase), e.g., wild-type Bst DNA polymerase.
In some embodiments, the modified polymerase consists of or comprises an isolated Bst DNA polymerase variant having or comprising a wild-type full-length Bst DNA polymerase amino acid sequence and further comprising one or more of the following amino acid substitutions: His46Arg (H46R), Glu446Gln (E446Q), and His572Arg (H572R), where the numbering is relative to the wild-type amino acid sequence of Bst DNA polymerase.
In some embodiments, the modified polymerase consists of or comprises an isolated polymerase variant having or comprising an amino acid sequence at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the amino acid sequence of wild-type full-length Bst DNA polymerase and further comprising one or more of the following amino acid substitutions: His46Arg (H46R), Glu446Gln (E446Q), and His572Arg (H572R), where the numbering is relative to the wild-type full-length amino acid sequence of Bst DNA polymerase. In some embodiments, the modified polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions, or chemical modifications) relative to a reference polymerase (e.g., a polymerase lacking the one or more amino acid modifications).
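Substitution labels such as H46R encode the wild-type residue, the 1-based position in the reference numbering, and the replacement residue. The sketch below shows one way such labels could be applied to a reference sequence while checking that the stated wild-type residue is actually present at that position. It is illustrative only: the function name is hypothetical, and the toy 10-residue sequence stands in for the wild-type Bst sequence, which is not reproduced here.

```python
import re

# One-letter substitution label: wild-type residue, position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(wild_type, substitutions):
    """Return a variant of `wild_type` carrying the given substitutions.

    Positions are 1-based and refer to the numbering of `wild_type`.
    Raises ValueError if a label is malformed, out of range, or names a
    wild-type residue that does not match the sequence.
    """
    residues = list(wild_type)
    for label in substitutions:
        parsed = SUBSTITUTION.match(label)
        if parsed is None:
            raise ValueError("malformed substitution: " + label)
        old, pos, new = parsed.group(1), int(parsed.group(2)), parsed.group(3)
        if pos < 1 or pos > len(residues):
            raise ValueError("position out of range: " + label)
        if wild_type[pos - 1] != old:
            raise ValueError("wild-type residue mismatch: " + label)
        residues[pos - 1] = new
    return "".join(residues)


# Toy reference sequence (hypothetical, 10 residues): H at 3, E at 5.
toy_wild_type = "MAHKEGHLTE"
print(apply_substitutions(toy_wild_type, ["H3R", "E5Q"]))
```

With a full-length reference sequence in hand, the same call with `["H46R", "E446Q", "H572R"]` would produce the triple-substituted variant, and the residue check guards against numbering that is offset relative to a different reference (for example, the large fragment rather than the full-length protein).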
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of an amino acid sequence having at least 80% identity to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
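The combination of a contiguous-residue threshold and a percent-identity threshold recited above can be illustrated with a minimal Python sketch. It assumes, for simplicity, that the query and reference sequences are already position-matched and of equal length, and it counts identity without gaps; in practice percent identity is generally determined with a gapped sequence alignment tool, which this sketch does not attempt to reproduce. The function names and toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(query, reference, window):
    """Highest identity found over any stretch of `window` contiguous residues.

    Assumes `query` and `reference` are position-matched (same length, same
    numbering), so each window is compared against the same coordinates.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be position-matched and of equal length")
    if not 1 <= window <= len(query):
        raise ValueError("window must be between 1 and the sequence length")
    return max(
        percent_identity(query[i:i + window], reference[i:i + window])
        for i in range(len(query) - window + 1)
    )


def meets_threshold(query, reference, window, min_identity):
    """True if some contiguous window reaches at least `min_identity` percent."""
    return best_window_identity(query, reference, window) >= min_identity


# Hypothetical toy sequences: identical over the first five residues only.
toy_query = "AAAAAAAAAA"
toy_reference = "AAAAACCCCC"
```

With real sequences, a call such as `meets_threshold(query, reference, 100, 90.0)` would correspond to the "at least 100 contiguous amino acid residues having at least 90% identity" language, subject to the ungapped, position-matched assumption stated above.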
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of an amino acid sequence having at least 90% identity to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase consists of or comprises an isolated polymerase variant having or comprising an amino acid sequence at least 80% identical to the amino acid sequence of SEQ ID NO: 2. In some embodiments, the polymerase is a variant of Taq DNA polymerase comprising an amino acid sequence at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2. In some embodiments, the reference polymerase is a Taq DNA polymerase consisting of the amino acid sequence of SEQ ID NO: 2, and the modified polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions, or chemical modifications) relative to the reference polymerase. In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases include a deletion or substitution of the methionine residue at position 1, wherein numbering is relative to the amino acid sequence of SEQ ID NO: 2.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase may have or comprise the amino acid sequence of SEQ ID NO: 3. In some embodiments, the modified polymerase can include the amino acid sequence of any biologically active fragment of the polymerase having or comprising an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the amino acid sequence of SEQ ID NO: 3. In some embodiments, the reference polymerase is a Taq DNA polymerase consisting of the amino acid sequence of SEQ ID NO: 3, and the modified polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions, or chemical modifications) relative to the reference polymerase. In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases include a deletion or substitution of the methionine residue at position 1, wherein numbering is relative to the amino acid sequence of SEQ ID NO: 3.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase consists of or comprises an isolated polymerase variant having or comprising an amino acid sequence at least 80% identical to the amino acid sequence of SEQ ID NO: 4. In some embodiments, the polymerase is a variant of Taq DNA polymerase comprising an amino acid sequence at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 4. In some embodiments, the reference polymerase is a Taq DNA polymerase consisting of the amino acid sequence of SEQ ID NO: 4, and the modified polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions, or chemical modifications) relative to the reference polymerase. In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases include a deletion or substitution of the methionine residue at position 1, wherein numbering is relative to the amino acid sequence of SEQ ID NO: 4.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy compared to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a modified polymerase including an isolated Taq DNA polymerase variant comprising an amino acid sequence selected from the group consisting of: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
In some embodiments, the present invention relates generally to a modified polymerase including an isolated Taq DNA polymerase variant comprising an amino acid sequence selected from the group consisting of: SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, and further comprises one or more non-naturally occurring amino acid substitutions. Optionally, the modified polymerase includes one, two, three, four, five or more amino acid substitutions relative to the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, the reference polymerase may comprise a Taq DNA polymerase having or comprising the amino acid sequence of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, wherein the modified polymerase comprises a variant of the reference polymerase, the modified polymerase thereby further comprising one, two, three, four, five or more amino acid substitutions relative to the reference polymerase. In some embodiments, the modified polymerase comprises or consists of an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the amino acid sequence of the reference polymerase, but typically less than 100% identical to that amino acid sequence. In some embodiments, the one, two, three, four, five, or more amino acid substitutions relative to the reference polymerase can include at least one conservative amino acid substitution.
In some embodiments, a modified polymerase or biologically active fragment thereof having improved thermostability and/or improved accuracy relative to a reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34) comprises or consists of an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof further comprises at least 25 contiguous amino acids of the polymerase DNA binding domain. In some embodiments, the modified polymerase or biologically active fragment thereof comprises at least 50 contiguous amino acid residues of the polymerase DNA binding domain. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues of the polymerase DNA binding domain. In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues of the polymerase DNA binding domain, while also having at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the modified polymerase or biologically active fragment thereof comprises or consists of at least 200 contiguous amino acid residues of the polymerase DNA binding domain, while also having at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the present invention relates generally to a composition comprising an isolated polypeptide having at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the present invention relates generally to a composition comprising an isolated nucleic acid having at least 80% identity to SEQ ID NO: 1, and further comprising at least one amino acid substitution at a position selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, and L828, where the numbering is relative to the amino acid residues of SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a composition comprising an isolated nucleic acid having at least 80% identity to SEQ ID NO: 1, and further comprising at least one amino acid substitution selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, where the numbering is relative to the amino acid residues of SEQ ID NO: 1.
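As an illustration of how a candidate sequence might be checked against the group of substitutions listed above, the following minimal Python sketch lists the point differences between a reference and a variant and reports which of them belong to the named group. It assumes the two sequences are of equal length and share the same numbering (substitutions only, no insertions or deletions). The function names and the six-residue toy sequences are hypothetical and do not represent SEQ ID NO: 1.

```python
# Substitutions named in the paragraph above (numbering relative to SEQ ID NO: 1).
NAMED_SUBSTITUTIONS = {
    "P6N", "A77E", "A97V", "L193V", "K240I", "R266Q", "E267T", "L287T",
    "P291T", "K292C", "E295F", "E295N", "E397V", "G418C", "L490Q", "A502S",
    "S543V", "D578E", "R593G", "L678F", "L678T", "S699W", "E713W", "V737A",
    "E745T", "L763F", "E790G", "E794C", "E805I", "L828A",
}


def call_substitutions(reference, variant):
    """List point substitutions as labels such as 'P6N' (1-based positions).

    Assumes equal-length, position-matched sequences.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be of equal length")
    return [
        f"{ref}{pos}{var}"
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


def named_substitutions_present(reference, variant):
    """Return the observed substitutions that belong to the named group."""
    return [s for s in call_substitutions(reference, variant)
            if s in NAMED_SUBSTITUTIONS]


# Hypothetical toy sequences: Pro at position 6 replaced by Asn.
toy_reference = "AAAAAP"
toy_variant = "AAAAAN"
toy_hits = named_substitutions_present(toy_reference, toy_variant)
```

A substitution that is observed but not in the named group (for example, a change at an unlisted position) is returned by `call_substitutions` but filtered out by `named_substitutions_present`.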
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 5 and having one or more amino acid mutations selected from the group consisting of: A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6 and having one or more amino acid mutations selected from the group consisting of: P6N, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 7 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 8 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 9 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 10 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 11 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 12 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 13 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 14 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 15 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 16 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 17 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 18 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 19 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 20 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 21 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 22 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 23 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 24 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 25 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 26 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 27 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 28 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 29 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 30 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E794C, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 31 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E805I, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 32 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, and L828A.
In some embodiments, the invention relates generally to an isolated and purified polypeptide comprising, or consisting of, an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 33 and having one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, and E805I.
In some embodiments, the composition comprises a sequence having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33, and further comprising at least one amino acid substitution at a position selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, and L828, wherein the numbering is relative to the amino acid residues of SEQ ID NO: 1. In some embodiments, the amino acid substitution comprises a conservative amino acid substitution.
In some embodiments, the composition comprises a sequence having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33, and further comprising at least one amino acid substitution selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A, wherein the numbering is relative to the amino acid residues of SEQ ID NO: 1.
In some embodiments, the composition comprises a sequence having at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33, and further comprising at least one amino acid substitution selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A, wherein the numbering is relative to the amino acid residues of SEQ ID NO: 1.
In some embodiments, the modified polymerase may include any one or more amino acid substitutions selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified polymerase has improved accuracy and/or improved thermostability relative to a reference polymerase. Without being bound by any particular theory of operation, it can be observed that in some embodiments, one or more of the above substitutions can alter, e.g., increase or decrease, the accuracy or thermostability of the modified polymerase relative to a reference (e.g., unmodified) polymerase. In some embodiments, such increases in accuracy and/or thermal stability may be observed in the form of increases in the signal generated in ion-based sequencing reactions.
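By way of a non-limiting sketch, the substitution notation used above (wild-type residue, 1-based position in the reference sequence, replacement residue, as in L763F) can be parsed and applied to a reference sequence as follows. The function names and the toy sequence used in the usage note are hypothetical and serve only to illustrate the numbering convention; the actual SEQ ID NO: 1 sequence is not reproduced here.

```python
import re

# Pattern for labels such as "L763F": wild-type residue, position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label like 'L763F' into ('L', 763, 'F')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(reference: str, labels: list[str]) -> str:
    """Apply substitutions to `reference`, numbering residues from 1.

    Raises ValueError if a position is out of range or if the residue in
    the reference does not match the wild-type residue named in the label.
    """
    residues = list(reference)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if position < 1 or position > len(residues):
            raise ValueError(f"{label}: position outside the reference")
        if reference[position - 1] != wild_type:
            raise ValueError(
                f"{label}: reference has {reference[position - 1]!r} "
                f"at position {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)
```

For example, applying the hypothetical label L5F to the toy sequence MRGMLP replaces the leucine at position 5 with phenylalanine. The wild-type check guards against applying a label to a sequence whose numbering does not match the reference.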
In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases can further comprise a deletion of the methionine residue at position 1 or a substitution of the methionine residue at position 1 with any other amino acid residue, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, the invention relates generally to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33.
In some embodiments, the invention relates generally to a composition comprising an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33, and further comprising one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A.
In some embodiments, the present invention relates generally to a vector comprising an isolated nucleic acid sequence encoding a polypeptide or biologically active fragment thereof selected from the group consisting of: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, and SEQ ID NO: 33. In some embodiments, the polypeptide or biologically active fragment thereof encoded by the isolated nucleic acid sequence of the vector comprises a DNA polymerase. In some embodiments, the DNA polymerase is Thermus aquaticus (Taq) polymerase. In some embodiments, the DNA polymerase is a thermostable DNA polymerase. In some embodiments, the DNA polymerase is derived from a thermostable Thermus aquaticus (Taq) polymerase.
In some embodiments, the present invention relates generally to a vector comprising an isolated nucleic acid sequence encoding a polypeptide or biologically active fragment thereof, said polypeptide or biologically active fragment thereof comprising a homolog of Taq DNA polymerase, wherein said homolog of Taq DNA polymerase comprises at least one amino acid substitution corresponding to an amino acid substitution present in any one of the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33.
In some embodiments, the present invention relates generally to a kit comprising isolated polypeptides having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33. In some embodiments, the kit comprises isolated polypeptides having at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33.
In some embodiments, the kit comprises an isolated polypeptide comprising or consisting of at least 250, at least 300, at least 350, at least 400, at least 450, at least 500, at least 550, at least 600, or at least 650 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33. In some embodiments, the kit further comprises one or more suitable buffers, MgCl2, and dNTPs.
In some embodiments, the present invention generally relates to a system (and related devices, kits, methods, and compositions) for amplifying one or more nucleic acids. In some embodiments, the system may comprise a DNA polymerase having at least one mutation (e.g., substitution, insertion, deletion, fusion, etc.) compared to the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 34; a solid support comprising a nucleic acid molecule to be amplified; a mixture of nucleotides (e.g., dNTPs, ddNTPs, etc.); and conditions for amplifying the nucleic acid molecule on the solid support. In some embodiments, the amplification may comprise clonal amplification or bridge PCR amplification. In some embodiments, amplification may include proximity ligation amplification, rolling circle amplification, PCR amplification, isothermal amplification, recombinase polymerase amplification, strand displacement amplification, emulsion PCR amplification, and the like. In illustrative embodiments, the DNA polymerase is a modified polymerase including any of the following mutations: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I, and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1.
In some embodiments, the invention generally relates to a polymerase or biologically active fragment thereof having DNA polymerase activity and at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, or SEQ ID NO: 33, wherein the polymerase or biologically active fragment having DNA polymerase activity comprises at least one amino acid substitution as compared to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the polymerase or biologically active fragment thereof comprises at least two, three, four, five, or more amino acid substitutions as compared to SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, at least one amino acid substitution as compared to SEQ ID NO: 1 or SEQ ID NO: 34 may confer a beneficial property to the polymerase or biologically active fragment thereof. In some embodiments, the beneficial properties conferred to the polymerase or biologically active fragment thereof (as compared to SEQ ID NO: 1 or SEQ ID NO: 34) include improved thermostability, improved read length, improved templating efficiency, improved performance in high ionic strength solutions, or improved accuracy. In some embodiments, the beneficial properties conferred to the polymerase or biologically active fragment thereof (as compared to SEQ ID NO: 1 or SEQ ID NO: 34) include reduced strand bias for GC-rich and AT-rich nucleic acids. It will be generally understood by those of ordinary skill in the art that the beneficial properties imparted to a polymerase or biologically active fragment thereof (as compared to the properties of SEQ ID NO: 1 or SEQ ID NO: 34) can be determined by assessing and/or measuring such beneficial properties by any suitable means under the same conditions (e.g., comparing the properties of SEQ ID NO: 1 to those of the polymerase or biologically active fragment thereof under the same conditions). For example, the accuracy of a DNA polymerase can be measured with respect to the longest perfect read obtained from a nucleotide polymerization reaction (typically measured with respect to the number of nucleotides correctly included in the read). In some embodiments, the nucleotide polymerization reaction may be performed using emulsion PCR, bridge PCR, or hot start PCR conditions. In some embodiments, one or more of the beneficial properties conferred to the polymerase or biologically active fragment thereof can be determined by assessing sequencing accuracy.
In some embodiments, sequencing accuracy can be determined using any next generation (i.e., massively parallel, high-throughput) sequencing platform (e.g., an Ion Torrent system, or an Illumina HiSeq, TruSeq or X-10 system). In some embodiments, sequencing accuracy can be determined using any ISFET-based sequencing system. However, it will be apparent that other suitable methods of determining improved thermal stability and/or improved accuracy may be used and are encompassed within the scope of the present invention.
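By way of non-limiting illustration, the "longest perfect read" measure of accuracy mentioned above can be sketched in Python as follows. The sketch assumes that the read has already been aligned to its reference and that both are supplied as equal-length, ungapped strings; the function names and the example sequences are hypothetical and do not appear elsewhere in this disclosure.

```python
def longest_perfect_read(read: str, reference: str) -> int:
    """Return the longest run of consecutive positions where read matches reference.

    Assumption: the two strings are already aligned position by position
    (no gaps), which is a simplification of a real sequencing alignment.
    """
    best = run = 0
    for read_base, ref_base in zip(read, reference):
        run = run + 1 if read_base == ref_base else 0  # reset the run on any mismatch
        best = max(best, run)
    return best


def per_base_accuracy(read: str, reference: str) -> float:
    """Fraction of aligned positions at which the read matches the reference."""
    matches = sum(r == t for r, t in zip(read, reference))
    return matches / len(reference)


# Hypothetical 8-base example with a single miscall at the fourth position.
print(longest_perfect_read("ACGAACGT", "ACGTACGT"))
print(per_base_accuracy("ACGAACGT", "ACGTACGT"))
```

Under the same conditions, the same two functions could be applied to reads produced with a reference polymerase and with a modified polymerase so that the values can be compared directly.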
In some embodiments, the invention generally relates to a substantially purified polymerase comprising or consisting of an amino acid sequence that is a biologically active fragment of: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the polymerase activity, characteristic, or property is selected from primer extension activity, strand displacement activity, proofreading activity, nick-initiated polymerase activity, reverse transcriptase activity, accuracy, average read length, thermostability, processivity, strand bias, or nucleotide polymerization activity. In some embodiments, the polymerase activity, characteristic, or property is selected from one or more sequencing-based metrics selected from raw read accuracy, average read length, thermostability, or processivity.
In some embodiments, the invention generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of a biologically active fragment of any one of the following sequences, the fragment having polymerase activity: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, the polymerase activity being selected from improved read length, improved accuracy, or improved thermal stability as compared to the polymerase activity of the following sequence under the same conditions: SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, polymerase activity is determined in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution is at least 120 mM KCl. In some embodiments, the high ionic strength solution is 125 mM KCl to 200 mM KCl.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E397 amino acid substitution, wherein numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E397V amino acid substitution, wherein numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an L763 amino acid substitution, wherein numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an L763F amino acid substitution, wherein numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E805 amino acid substitution, wherein numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E805I amino acid substitution, wherein numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E745 amino acid substitution, wherein numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises an E745T amino acid substitution, wherein numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E397 amino acid substitution, wherein numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E397V amino acid substitution, wherein numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an L763 amino acid substitution, wherein numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an L763F amino acid substitution, wherein numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E805 amino acid substitution, wherein numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E805I amino acid substitution, wherein numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E745 amino acid substitution, wherein numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises an E745T amino acid substitution, wherein numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises E397, E745 and L763 amino acid substitutions, wherein numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises E397V, E745T and L763F amino acid substitutions, wherein numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises E397, E745 and L763 amino acid substitutions, wherein numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises E397V, E745T and L763F amino acid substitutions, wherein numbering is relative to SEQ ID NO: 34.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises E805 and L763 amino acid substitutions, wherein numbering is relative to SEQ ID NO: 1. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and further comprises E805I and L763F amino acid substitutions, wherein numbering is relative to SEQ ID NO: 1.
In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises E805 and L763 amino acid substitutions, wherein numbering is relative to SEQ ID NO: 34. In some embodiments, the present invention relates generally to a substantially purified polymerase comprising or consisting of an amino acid sequence that is at least 90% identical to SEQ ID NO: 34 and further comprises E805I and L763F amino acid substitutions, wherein numbering is relative to SEQ ID NO: 34.
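By way of non-limiting illustration, the substitution notation used in the preceding embodiments (e.g., L763F, meaning that the leucine at position 763 of the reference sequence is replaced by phenylalanine) and a simple percent-identity calculation can be sketched in Python. The function names and the six-residue toy sequence are hypothetical; the toy sequence is not SEQ ID NO: 1 or SEQ ID NO: 34. Identity is computed here over two equal-length, ungapped sequences, which is a simplifying assumption, since in practice identity is typically computed over a sequence alignment.

```python
import re


def parse_substitution(code: str) -> tuple[str, int, str]:
    """Split a code such as 'L763F' into (original residue, 1-based position, new residue)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if m is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return m.group(1), int(m.group(2)), m.group(3)


def apply_substitutions(reference: str, codes: list[str]) -> str:
    """Apply substitutions to a reference sequence, numbering relative to that reference."""
    seq = list(reference)
    for code in codes:
        original, pos, new = parse_substitution(code)
        if seq[pos - 1] != original:
            # Guards against numbering drift, e.g. after deletion of Met at position 1.
            raise ValueError(f"{code}: reference has {seq[pos - 1]} at position {pos}")
        seq[pos - 1] = new
    return "".join(seq)


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (align them first)")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


toy_reference = "MLEKLE"  # hypothetical toy sequence for illustration only
toy_variant = apply_substitutions(toy_reference, ["L2F", "E6I"])
print(toy_variant, round(percent_identity(toy_reference, toy_variant), 1))
```

The check on the original residue reflects the statement that numbering is relative to a named reference sequence: a code is only meaningful when the reference actually carries the stated residue at the stated position.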
In some embodiments, the reference polymerase has or comprises the following amino acid sequence: SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and the modified polymerase has or comprises the amino acid sequence of the reference polymerase and further comprises one or more amino acid mutations as compared to the reference polymerase. In some embodiments, the amino acid mutation comprises a substitution of an existing amino acid residue at the specified position with any other amino acid residue (including naturally occurring and non-natural amino acid residues). In some embodiments, the amino acid substitution is a conservative substitution; alternatively, the amino acid substitution may be a non-conservative substitution. In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases can further comprise a deletion of the methionine residue at position 1 or a substitution of the methionine residue at position 1 with any other amino acid residue, wherein numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified polymerase exhibits a change in any one or more parameters selected from the group consisting of: average read length, accuracy, total sequencing throughput, strand bias, reduced systematic error, enhanced polymerase performance in high ionic strength solutions, improved processivity, improved performance in PCR, and performance in emulsion PCR. Optionally, the change in any one or more parameters is observed by comparing the performance of the reference polymerase and the modified polymerase in an ion-based sequencing reaction.
Without being bound by any particular theory of operation, it can be observed that in some embodiments, a modified polymerase including one or more of the disclosed amino acid substitutions exhibits altered (e.g., increased) processivity relative to an unmodified polymerase or altered (e.g., decreased) strand bias relative to an unmodified polymerase. In some embodiments, the modified polymerase exhibits altered (e.g., increased) precision relative to the unmodified polymerase. In some embodiments, the modified polymerase exhibits an altered (e.g., increased) average error-free read length or an altered (e.g., increased) observed 100Q17 or 200Q17 relative to a reference polymerase. In some embodiments, the modified polymerase has polymerase activity. In some embodiments, the modified polymerase or biologically active fragment can have in vivo or in vitro primer extension activity.
In some embodiments, the one or more mutations in the modified polymerase can include at least one amino acid substitution. The at least one amino acid substitution may optionally occur at any one or more positions selected from the group consisting of: P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805, and L828, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified polymerase includes at least two, three, four, five, or more amino acid substitutions that occur at positions selected from this group. In some embodiments, the at least one amino acid substitution may optionally be selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified polymerase includes at least two, three, four, five, or more amino acid substitutions selected from this group.
Without being bound by any particular theory of operation, it can be observed that, in some embodiments, a modified polymerase including any of such amino acid substitutions exhibits altered (e.g., increased or decreased) thermostability relative to the unmodified polymerase or altered (e.g., increased or decreased) accuracy relative to a corresponding unmodified polymerase or relative to a reference polymerase. It will be apparent to those of ordinary skill in the art that some of the amino acid residues of the modified polymerase may be highly conserved amino acid residues. It is contemplated that one of ordinary skill in the art can construct, express, and determine which amino acid residues, if any, in a given polymerase are highly conserved by well-known means (see, e.g., U.S. Pat. No. 5,436,149; U.S. Pat. No. 6,395,524; U.S. Pat. No. 6,982,144; U.S. Pat. No. 7,312,059, and U.S. Pat. No. 8,420,325, all of which are incorporated herein in their entirety).
In some embodiments, the modified polymerase can include Taq DNA polymerase. In some embodiments, the polymerase can include Taq DNA polymerase commercially available as Platinum Taq High Fidelity DNA Polymerase (Life Technologies, California) that includes one or more amino acid mutations as compared to a reference polymerase. In some embodiments, the modified polymerase can include a Taq DNA polymerase having or comprising the amino acid sequence of SEQ ID NO: 1, which is the amino acid sequence of wild-type Taq DNA polymerase.
In some embodiments, the modified polymerase includes a mutant or variant form of Taq DNA polymerase that retains a detectable level of polymerase activity. In order to retain the polymerase activity of Taq DNA polymerase, any substitution, deletion or chemical modification will typically be made at amino acid residues that are not highly conserved, that is, not at residues such as the invariant aspartate residues required for polymerase activity. In some embodiments, the modified polymerase can include Taq DNA polymerase, hot start Taq DNA polymerase, chemical hot start Taq DNA polymerase, Platinum Taq DNA polymerase, and the like.
In some embodiments, the modified polymerase can include an isolated polymerase variant having or comprising an amino acid sequence at least 90% identical to the amino acid sequence of SEQ ID NO: 2. In some embodiments, the polymerase is a Taq DNA polymerase variant comprising the amino acid sequence of SEQ ID NO: 4, wherein the variant comprises an amino acid sequence at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2.
In some embodiments, the modified polymerase includes a mutant or variant form of Taq DNA polymerase having the amino acid mutation E397V, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase may include the amino acid mutation L763F, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase may include the amino acid mutation E805I, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase may include the amino acid mutation E745T, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase may include the amino acid mutation A97V, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase may include the amino acid mutation E295F, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase may include the amino acid mutation P6N, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, a modified polymerase including one or more of the above mutations exhibits altered (e.g., increased or decreased) accuracy relative to a corresponding reference polymerase (e.g., the unmodified polymerase of SEQ ID NO: 1). In some embodiments, the modified Taq polymerase has altered (e.g., increased or decreased) thermostability relative to a reference polymerase (e.g., the unmodified polymerase of SEQ ID NO: 1).
In some embodiments, the modified Taq polymerase exhibits an altered (e.g., increased or decreased) read length, or an altered (e.g., increased or decreased) strand bias, or an altered (e.g., increased or decreased) processivity, or an altered systematic error (e.g., increased or decreased), or an altered (e.g., increased or decreased) 100Q17 or 200Q17 observation, or an altered (e.g., increased or decreased) AQ17 or AQ20 value, relative to a reference polymerase.
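By way of non-limiting illustration, an AQ17 or AQ20 read length can be sketched in Python under a stated assumption: the AQ-N read length of an aligned read is taken to be the length of the longest prefix of the read whose cumulative error rate does not exceed the error rate corresponding to Phred score N, i.e., 10^(-N/10). This definition is an assumption adopted for the sketch and is not defined elsewhere in this disclosure; the function names and the example data are hypothetical.

```python
def phred_to_error_rate(q: float) -> float:
    """Error probability for a Phred-scaled quality score: p = 10 ** (-q / 10)."""
    return 10 ** (-q / 10)


def aq_read_length(error_flags: list[bool], q: float) -> int:
    """Longest read prefix whose cumulative error rate is at most the Phred-q rate.

    error_flags[i] is True when aligned base i of the read disagrees with the
    reference (assumed definition; see the lead-in paragraph).
    """
    limit = phred_to_error_rate(q)
    errors = 0
    best = 0
    for length, is_error in enumerate(error_flags, start=1):
        errors += is_error
        if errors / length <= limit:
            best = length  # this prefix still meets the quality threshold
    return best


# Hypothetical 150-base read with errors at the 30th and 40th bases.
flags = [False] * 150
flags[29] = True
flags[39] = True
print(aq_read_length(flags, 17), aq_read_length(flags, 20))
```

In this example the whole 150-base read satisfies the Q17 threshold (about 2% error), whereas only the first 29 bases satisfy the stricter Q20 threshold (1% error), which shows how the two values can diverge for the same read.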
In some embodiments, the modified Taq polymerase exhibits a change in any one or more of the following parameters relative to a reference polymerase: average read length, performance in high ionic strength solutions, improved processivity, improved templating efficiency, improved thermal stability, improved performance in emulsion PCR, reduced strand bias in GC-or AT-rich sequences, or reduced systematic error. In one embodiment, the change in one or more parameters is observed by comparing the performance of a reference polymerase and a modified polymerase under the same conditions. Optionally, changes in one or more parameters can be observed using an ion-based sequencing reaction.
In some embodiments, the modified polymerase can include at least one amino acid substitution of an existing amino acid residue at the specified position with any other amino acid residue, including naturally occurring and non-natural amino acid residues. In some embodiments, the amino acid substitution is a conservative substitution; alternatively, the amino acid substitution may be a non-conservative substitution. In some embodiments, the reference polymerase, the modified Taq polymerase, or both the reference and modified Taq polymerases can further comprise a deletion of the methionine residue at position 1 or a substitution of the methionine residue at position 1 with any other amino acid residue, wherein numbering is relative to the amino acid sequence of SEQ ID NO: 1.
As the skilled artisan will readily appreciate, the scope of the present invention encompasses not only the specific amino acid and/or nucleotide sequences disclosed herein, but also a variety of related sequences that encode, for example, genes and/or peptides having the functional properties described herein. For example, the scope and spirit of the present invention encompass any amino acid sequence that is a conservative variant of the various polymerases disclosed herein and any nucleotide sequence encoding such a variant. It will also be immediately apparent to the skilled artisan that the amino acid sequences disclosed herein for the modified polymerases can be converted to corresponding nucleotide sequences without undue experimentation, for example in silico using any of a number of freely available sequence conversion applications.
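By way of non-limiting illustration, an in silico conversion of an amino acid sequence to one possible encoding nucleotide sequence can be sketched in Python. Because the genetic code is degenerate, many nucleotide sequences encode the same polypeptide; the sketch picks one codon per amino acid from the standard genetic code. The particular codon chosen for each residue, the choice of stop codon, and the function name are arbitrary assumptions made for illustration and are not taken from this disclosure.

```python
# One codon per amino acid, chosen arbitrarily from the standard genetic code.
CODON_FOR = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str, stop_codon: str = "TAA") -> str:
    """Return one DNA coding sequence for the given protein, followed by a stop codon.

    Raises KeyError for any character outside the 20 standard amino acids.
    """
    return "".join(CODON_FOR[residue] for residue in protein) + stop_codon


# Hypothetical three-residue peptide, not a sequence from the sequence listing.
print(back_translate("MLF"))  # ATGCTGTTTTAA
```

A practical conversion would usually also weight codon choice by the codon usage of the intended expression host, which this sketch does not attempt.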
It is expected that one of skill in the art, having identified one or more amino acid substitutions disclosed herein that confer beneficial properties on a modified polymerase (such as improved thermostability, improved accuracy, improved processivity, or improved read length compared to a reference polymerase), can transfer such substitutions to a different polymerase species or polymerase family without undue experimentation. Thus, after identifying amino acid mutations in a polymerase that provide altered catalytic or kinetic properties, the amino acid mutations can be screened using methods known to those of ordinary skill in the art (e.g., amino acid or nucleotide sequence alignments) to determine whether the amino acid mutations can be readily transferred to a different polymerase, e.g., a polymerase from a different species. In some embodiments, transferable (or homologous) amino acid mutations can include amino acid mutations that enhance properties such as increased read length, increased raw accuracy, reduced strand bias, reduced systematic error, increased total sequencing throughput, increased error-free read length, increased processivity, increased AQ values, and the like. In some embodiments, a transferable (or homologous) amino acid mutation can include transferring one or more amino acid mutations into another polymerase within or between DNA polymerase families (e.g., DNA polymerase family A or DNA polymerase family B). In some embodiments, a transferable (or homologous) amino acid mutation can include transferring one or more amino acid mutations into one or more polymerases within or between DNA polymerase families, such as DNA polymerases across bacteria, viruses, archaea, eukaryotes, or bacteriophages.
In some embodiments, a modified polymerase according to the present invention may include a polymerase (homolog) having one or more amino acid mutations (e.g., substitutions, insertions, or deletions) homologous to one or more of the amino acid mutations disclosed herein. For example, the present invention includes within its scope a modified polymerase having one or more amino acid mutations homologous to any one of the amino acid mutations provided in the following sequences: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33. In some embodiments, the modified polymerases according to the present invention may include any polymerase having one or more amino acid mutations homologous to one or more of the amino acid mutations provided herein for Taq DNA polymerase (e.g., one or more homologous amino acid mutations corresponding to one or more of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33). A method for determining whether a polymerase is a homolog of one or more of the modified polymerases disclosed herein includes comparing amino acid or nucleic acid sequence alignments of the modified polymerases against a "test" polymerase. For example, the National Center for Biotechnology Information (NCBI) provides a variety of electronic databases (e.g., "HomoloGene" and "Protein Clusters") that allow a user to determine whether an amino acid sequence exists as a homolog in another organism.
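By way of non-limiting illustration, the use of a sequence alignment to locate the position in a "test" polymerase that is homologous to a given position in a reference polymerase can be sketched in Python. The sketch uses a basic Needleman-Wunsch global alignment with a uniform match, mismatch, and gap score; these scores, the function names, and the short toy sequences are assumptions made for illustration. A practical comparison would typically use an established alignment tool with an amino acid substitution matrix and affine gap penalties.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def map_position(aligned_ref: str, aligned_test: str, ref_pos: int):
    """Map a 1-based reference position to (test position, test residue), or None if gapped."""
    ref_i = test_i = 0
    for r, t in zip(aligned_ref, aligned_test):
        ref_i += r != "-"
        test_i += t != "-"
        if r != "-" and ref_i == ref_pos:
            return (test_i, t) if t != "-" else None
    return None


# Hypothetical toy sequences: the test sequence lacks the initial M and has D in place of E.
ref_aln, test_aln = global_align("MKLEVF", "KLDVF")
print(ref_aln, test_aln, map_position(ref_aln, test_aln, 4))
```

Here position 4 of the toy reference (E) maps to position 3 of the toy test sequence (D), so a mutation described at reference position 4 would be considered at test position 3, while reference position 1 has no counterpart in the test sequence.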
In some embodiments, a modified polymerase or biologically active fragment of a polymerase according to the invention can include a polymerase having one or more amino acid mutations homologous to one or more amino acid mutations of Taq DNA polymerase, including any one or more amino acid mutations selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1.
In some embodiments, the modified polymerase or biologically active fragment thereof includes two, three, four, five or more amino acid mutations homologous to any two, three, four, five or more amino acid substitutions selected from the group consisting of: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1.
In some embodiments, a modified polymerase or any biologically active fragment of a polymerase having one or more amino acid mutations homologous to the amino acid mutations disclosed herein for Taq DNA polymerase may optionally include at least one amino acid substitution designed to replace a non-cysteine amino acid residue with a cysteine residue. The skilled person will be readily able to determine a nucleotide sequence encoding any of the amino acid sequences of the present invention based on the known correspondence between the nucleotide sequence and the corresponding protein sequence.
In some embodiments, the modified polymerase or any biologically active fragment of the polymerase may include one or more biotin moieties. As used herein, the terms "biotin" and "biotin moiety", and variations thereof, comprise biotin (cis-hexahydro-2-oxo-1H-thieno[3,4]imidazole-4-pentanoic acid) and any derivatives and analogs thereof, including biotin-like compounds. Such compounds include, for example, biotin-ε-N-lysine, biocytin hydrazide, amino or sulfhydryl derivatives of 2-iminobiotin and biotinyl-ε-aminocaproic acid-N-hydroxysuccinimide ester, sulfosuccinimidyl biotin, biotin bromoacetyl hydrazide, diazobenzoyl biocytin, 3-(N-maleimidopropanoyl) biocytin, and the like, as well as any biotin variant that can specifically bind to an avidin moiety. As used herein, the terms "avidin" and "avidin moiety" and variations thereof include the native egg-white glycoprotein avidin, as well as any derivatives, analogs, and other non-native forms of avidin that can specifically bind to a biotin moiety. In some embodiments, the avidin moiety may comprise deglycosylated forms of avidin, bacterial streptavidins produced by selected strains of Streptomyces (e.g., Streptomyces avidinii), truncated streptavidins, recombinant avidin and streptavidin, derivatives of native, deglycosylated and recombinant avidin, and derivatives of native, recombinant and truncated streptavidin, such as N-acyl avidins (e.g., N-acetyl, N-phthaloyl and N-succinyl avidin), and commercial products such as Neutralite. All forms of avidin-type molecules, including native and recombinant avidin and streptavidin, as well as derivatized molecules (e.g., unglycosylated avidin, N-acyl avidins, and truncated streptavidins), are encompassed by the terms "avidin" and "avidin moiety". Typically, but not necessarily, avidin exists in the form of a tetrameric protein, wherein each of the four subunits of the tetramer is capable of binding at least one biotin moiety.
As used herein, the term "biotin-avidin linkage" and variations thereof refers to the specific linkage formed between a biotin moiety and an avidin moiety. Typically, the biotin moiety can bind to the avidin moiety with high affinity, with a dissociation constant Kd typically of about 10⁻¹⁴ to 10⁻¹⁵ mol/L. Typically, such binding occurs via non-covalent interactions.
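By way of non-limiting illustration, the strength of binding implied by a dissociation constant of about 10^-14 to 10^-15 mol/L can be expressed as a standard binding free energy using the general thermodynamic relation ΔG° = RT ln(Kd), with Kd taken relative to a 1 mol/L standard state. The Python sketch below assumes a temperature of 298.15 K; the temperature and the function name are assumptions made for illustration and are not taken from this disclosure.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def binding_free_energy_kj(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard free energy of binding in kJ/mol from a dissociation constant.

    Uses dG = R * T * ln(Kd / 1 M); more negative values mean tighter binding.
    The default temperature of 298.15 K is an assumption for this sketch.
    """
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0


for kd in (1e-14, 1e-15):
    print(f"Kd = {kd:g} mol/L -> dG = {binding_free_energy_kj(kd):.1f} kJ/mol")
```

Under these assumptions the stated range corresponds to roughly -80 to -86 kJ/mol, which is consistent with the description of the biotin-avidin linkage as a high-affinity, non-covalent interaction.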
In some embodiments, the modified polymerase or any biologically active fragment of the polymerase may include one or more modified or substituted amino acids relative to the unmodified or reference polymerase, and may further include a biotin moiety attached to at least one of the one or more modified or substituted amino acids. The biotin moiety may be attached to the modified polymerase using any suitable attachment method. In some embodiments, the modified polymerase includes one or more cysteine substitutions and the linking moiety includes a biotin moiety linked to at least one of the one or more cysteine substitutions. In some embodiments, the modified polymerase can be chemically modified to be reversibly inactivated such that it is activated by heat (see, e.g., U.S. Pat. No. 5,677,152 to Birch et al.). In these embodiments, the modified polymerase is more suitable for hot-start amplification methods, such as hot-start PCR methods.
In some embodiments, the modified polymerase is a biotinylated polymerase. As used herein, the term "biotinylation" and variations thereof refers to any covalent or non-covalent adduct of biotin with other moieties such as biomolecules, e.g., nucleic acids (including DNA, RNA, DNA/RNA chimeric molecules, nucleic acid analogs, and peptide nucleic acids); proteins (including enzymes, peptides, and antibodies); carbohydrates; lipids; and the like.
In some embodiments, the present invention also generally relates to compositions (and related methods, kits, systems, and devices) comprising a modified polymerase that includes at least one amino acid modification relative to a reference polymerase, wherein the modified polymerase has improved processivity, improved thermostability, and/or improved accuracy relative to the reference polymerase.
In some embodiments, the present invention relates generally to a method for incorporating at least one nucleotide into a primer, comprising: contacting a nucleic acid complex comprising a template nucleic acid with a primer and a modified polymerase in the presence of one or more nucleotides, and incorporating at least one of the one or more nucleotides into the primer in a template-dependent manner using the modified polymerase.
Methods for nucleotide incorporation are well known in the art and typically comprise the use of a polymerase reaction mixture in which a polymerase is contacted with a template nucleic acid under nucleotide incorporation conditions. When a nucleotide incorporation reaction involves polymerizing a nucleotide onto a primer end, the process is typically referred to as "primer extension". Typically, but not necessarily, such nucleotide incorporation occurs in a template-dependent manner. Primer extension and other nucleotide incorporation assays are typically performed by contacting a template nucleic acid in the presence of nucleotides with a polymerase in an aqueous solution under nucleotide incorporation conditions. In some embodiments, the nucleotide incorporation reaction may include a primer, which may optionally hybridize to a template to form a primer-template duplex. Typical nucleotide incorporation conditions are achieved after the template, polymerase, nucleotides, and optionally primers are mixed with one another in a suitable aqueous formulation to form a nucleotide incorporation reaction mixture (or primer extension mixture). The aqueous formulation may optionally include divalent cations and/or salts, especially Mg²⁺ and/or Ca²⁺ ions. The aqueous formulation may optionally include divalent anions and/or salts, especially SO₄²⁻. Typical nucleotide incorporation conditions include well-known parameters such as time, temperature, pH, reagents, buffers, salts, cofactors, nucleotides, target DNA, primer DNA, enzymes (such as nucleic acid-dependent polymerases), and the amounts and/or ratios of components in the reaction. The reagent or buffer may include a monovalent ion source such as KCl, potassium acetate, ammonium acetate, potassium glutamate, NH₄Cl or ammonium sulfate. The reagent or buffer may include a source of divalent ions, such as Mg²⁺ and/or Mn²⁺, MgCl₂, or magnesium acetate.
In some embodiments, the reagent or buffer may include a detergent source, such as Triton and/or Tween. Most polymerases exhibit some level of nucleotide incorporation activity at a pH range of about 5.0 to about 9.5, more typically between about pH 7 and about pH 9, sometimes between about pH 6 and about pH 8, and sometimes between pH 7 and 8. In some embodiments, the nucleotide polymerization buffer may include a chelating agent, such as EDTA and/or EGTA, among others. Although in some embodiments the nucleotide incorporation reaction may include a buffer, such as Tris, Tricine, HEPES, MOPS, ACES, or MES, which may provide a pH range of about 5.0 to about 9.5, such buffers may optionally be reduced or omitted when performing ion-based reactions requiring detection of ionic byproducts. In some embodiments, the nucleotide incorporation reaction can include trehalose. Methods of performing nucleic acid synthesis are well known and widely practiced in the art, and references teaching a wide range of nucleic acid synthesis techniques are readily available. Guidance for nucleic acid synthesis (including, e.g., template-dependent nucleotide incorporation and primer extension methods) may be found, for example, in Kim et al., Nature 376:612-; Ichida et al., Nucleic Acids Res. 33:5214-5222 (2005); Pandey et al., Eur. J. Biochem. 214:59-65 (1993); Blanco et al., J. Biol. Chem. 268:16763-; U.S. patent application No. 12/002781, now published as U.S. patent publication No. 2009/0026082; U.S. patent application No. 12/474897, now published as U.S. patent publication No. 2010/0137143; U.S. patent application No. 12/492844, now published as U.S. patent publication No. 2010/0282617; and U.S. patent application No. 12/748359, now published as U.S. patent publication No. 2011/0014612.
Suitable reaction conditions for nucleotide incorporation using the modified polymerases of the present invention will be immediately apparent to the skilled artisan in view of the extensive teachings of primer extension and other nucleotide incorporation reactions in the art. In some embodiments, methods (and related kits, devices, systems, and compositions) may include the incorporation of one or more nucleotide analogs and/or reversible terminators.
In some embodiments, the present invention generally relates to reagents (e.g., buffer compositions) and kits suitable for use in nucleotide polymerization reactions using polymerases comprising any of the exemplary modified polymerases described herein. Nucleotide polymerization reactions can include, but are not limited to, nucleotide incorporation reactions (including both template-dependent and template-independent nucleotide incorporation reactions) and primer extension reactions. In some embodiments, the buffer composition may include any one or more of the following: monovalent metal salts, divalent anions, and detergents. For example, the buffer composition may include a potassium or sodium salt. In some embodiments, the buffer composition may include a manganese or magnesium salt. In some embodiments, the buffer composition may include a sulfate salt, such as potassium sulfate and/or magnesium sulfate. In some embodiments, the buffer composition may include a detergent. In some embodiments, the buffer composition may comprise a detergent selected from the group consisting of Triton and Tween. In some embodiments, the buffer may include reagents for a hot start amplification step, such as an oligonucleotide or an aptamer.
In some embodiments, the buffer composition may include at least one potassium salt, at least one manganese salt, and Triton X-100 (Pierce Biochemicals). The salt may optionally include a chloride salt or a sulfate salt. In some embodiments, the buffer composition may have a pH of about 7.3 to about 8.0. In some embodiments, the buffer composition can have a pH of about 7.4 to about 7.9. In some embodiments, the buffer composition comprises a potassium salt (divalent dependent) at a concentration of between 5-250 mM, 50-225 mM, or 125-200 mM.
In some embodiments, the buffer composition comprises a magnesium or manganese salt at a concentration between 1mM and 20 mM. In some embodiments, the buffer composition comprises a magnesium or manganese salt at a concentration between 6-15 mM.
In some embodiments, the buffer composition comprises sulfate at a concentration between 1mM and 100 mM. In some embodiments, the buffer composition comprises sulfate at a concentration between 5-50 mM.
In some embodiments, the buffer composition includes a detergent (e.g., Triton X-100 or Tween-20) at a concentration of between 0.001% and 1%. In some embodiments, the buffer composition includes a detergent (e.g., Triton X-100 or Tween-20) at a concentration of between 0.0025% and 0.0125%.
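By way of illustration only, the broadest concentration and pH ranges recited above can be collected into a simple lookup and used to check a candidate buffer recipe. The following is a minimal sketch; the dictionary keys, the function name, and the example recipe are hypothetical and are not part of the disclosed compositions.

```python
# Broadest ranges recited in the preceding paragraphs (units in the key names).
# All identifiers here are hypothetical and chosen for illustration only.
BUFFER_RANGES = {
    "potassium_salt_mM": (5.0, 250.0),
    "mg_or_mn_salt_mM": (1.0, 20.0),
    "sulfate_mM": (1.0, 100.0),
    "detergent_percent": (0.001, 1.0),
    "pH": (7.3, 8.0),
}


def out_of_range(recipe):
    """Return the sorted names of components that fall outside the ranges."""
    problems = []
    for name, value in recipe.items():
        low, high = BUFFER_RANGES[name]
        if not (low <= value <= high):
            problems.append(name)
    return sorted(problems)


# A made-up recipe that sits inside every range listed above.
example_recipe = {
    "potassium_salt_mM": 125.0,
    "mg_or_mn_salt_mM": 10.0,
    "sulfate_mM": 20.0,
    "detergent_percent": 0.01,
    "pH": 7.6,
}

print(out_of_range(example_recipe))
```

A recipe with, for example, a pH of 8.5 would be reported by name, which makes it easy to see which component departs from the recited ranges.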
In some embodiments, the disclosed modified polymerase compositions (and related methods, systems, devices, and kits) can be used to obtain sequence information from a nucleic acid molecule. Many methods of obtaining sequence information from nucleic acid molecules are known in the art, and it will be readily appreciated that all such methods are within the scope of the present invention. Suitable sequencing methods using the disclosed modified polymerases include (but are not limited to): Sanger sequencing, ligation-based sequencing (also known as sequencing by hybridization), and sequencing by synthesis. Sequencing-by-synthesis methods typically involve template-dependent nucleic acid synthesis (e.g., using primers that hybridize to a template nucleic acid or a self-priming template, as will be appreciated by one of ordinary skill in the art) based on the sequence of the template nucleic acid. That is, the sequence of the newly synthesized nucleic acid strand is typically complementary to the template nucleic acid sequence, and thus knowledge of the order and identity of the nucleotides incorporated into the synthesized strand can provide information about the sequence of the template nucleic acid strand. Sequencing-by-synthesis using the modified polymerases of the invention will typically involve detecting the order and identity of nucleotide incorporation when the nucleotides are polymerized in a template-dependent manner by the modified polymerase. In some embodiments, sequencing-by-synthesis may include optical single molecule sequencing (e.g., sequencing in the absence of labeled nucleotides). Alternatively, some exemplary methods of sequencing-by-synthesis using labeled nucleotides include single molecule sequencing (see, e.g., U.S. Pat. Nos. 7,329,492 and 7,033,764), which typically involves the use of labeled nucleotides to detect nucleotide incorporation.
In some embodiments, the disclosed polymerase compositions (and related methods, kits, systems, and devices) can be used to obtain sequence information. In some embodiments, the disclosed modified polymerases can be used to obtain sequence information for: whole genome sequencing, amplicon sequencing, targeted resequencing, single molecule sequencing, multiplex and/or barcoded sequencing or paired-end sequencing applications, and the like.
In some embodiments, the disclosed modified polymerase compositions and related methods, systems, devices, and kits can be used to amplify nucleic acid molecules. In some embodiments, the nucleic acid molecule can be amplified by any suitable method using a modified polymerase. In some embodiments, the nucleic acid molecule can be amplified, for example, by pyrosequencing, ion-based ISFET sequencing, PCR, emulsion PCR, or bridge polymerase chain reaction.
In some embodiments, the disclosed modified polymerase compositions (and related methods, systems, devices, and kits) can be used to generate nucleic acid libraries. In some embodiments, the disclosed modified polymerase compositions can be used to generate nucleic acid libraries for a variety of downstream processes. Many methods for generating nucleic acid libraries are known in the art, and it will be readily appreciated that all such methods are within the scope of the present invention. Suitable polymerization-based methods include, but are not limited to, generating nucleic acid libraries using emulsion PCR, bridge PCR, qPCR, RT-PCR, nested patch PCR, and other forms of nucleic acid amplification. In some embodiments, the method may comprise template-dependent nucleic acid amplification. In some embodiments, the method may include a primer-template duplex or a nucleic acid template from which the modified polymerase may perform nucleotide incorporation. In some embodiments, the nucleic acid may comprise a single-stranded nucleic acid having a secondary structure, such as a hairpin or stem-loop, which may provide a single-stranded overhang onto which the modified polymerase may incorporate nucleotides during polymerization. In some embodiments, methods of generating a nucleic acid library using one or more of the modified polymerases according to the present disclosure can include generating a nucleic acid library having a length of 50, 100, 200, 300, 400, 500, 600, 700, 800, or more base pairs. In some embodiments, the nucleic acid template on which the modified polymerase can perform nucleotide incorporation can be attached, linked, or bound to a support, such as a solid support. In some embodiments, the support may comprise a planar support, such as a slide or flow cell. In some embodiments, the support can include particles, such as nucleic acid sequencing beads (e.g., Ion Sphere™ Particles, Life Technologies, CA).
In some embodiments, the present invention generally relates to a method for generating a nucleic acid library comprising contacting a nucleic acid template with a modified polymerase and one or more dNTPs under polymerization conditions, thereby incorporating one or more dNTPs into the nucleic acid template to generate the nucleic acid library. In some embodiments, the method may further comprise generating or sequencing a nucleic acid library in the presence of a high ionic strength solution. In some embodiments, the present invention generally relates to modified polymerases that retain polymerase activity in the presence of high ionic strength solutions. In some embodiments, the high ionic strength solution may be at least 120mM salt. In some embodiments, the high ionic strength solution may be 125mM to 200mM salt. In some embodiments, the salt may include a potassium salt and/or a sodium salt. In some embodiments, the salt may include NaCl and/or KCl. In some embodiments, the high ionic strength solution may further comprise a sulfate. In some embodiments, the modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater capacity (e.g., as measured by accuracy) than a reference polymerase lacking one or more of the corresponding amino acid mutations under the same conditions. In some embodiments, the modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater capacity (e.g., as measured by thermostability) than a reference polymerase lacking one or more of the amino acid mutations under the same conditions.
In some embodiments, the modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater capacity (e.g., as measured by sustained synthesis capacity) compared to a reference polymerase lacking one or more of the amino acid mutations under the same conditions.
Optionally, the method further comprises repeating the addition of one or more dNTPs under polymerization conditions to incorporate a plurality of dNTPs into the nucleic acid template to generate the nucleic acid library.
In some embodiments, the method can further comprise detecting nucleotide incorporation by-products during the polymerization. In some embodiments, the nucleotide incorporation byproducts can include hydrogen and/or phosphate ions.
In some embodiments, the method further comprises determining the identity of the dNTPs incorporated in the nucleic acid library. In some embodiments, the method further comprises determining the number of nucleotides incorporated in the nucleic acid library. In some embodiments, detecting may further comprise sequencing the nucleic acid library.
In some embodiments, the disclosed modified polymerase compositions (and related methods, systems, devices, and kits) can be used to detect nucleotide incorporation via the formation of byproducts during a nucleotide incorporation event. Many methods of detecting nucleotide incorporation byproducts are known in the art, and it will be readily appreciated that all such methods are within the scope of the present invention. Suitable methods for nucleotide byproduct detection include, but are not limited to, detection of hydrogen ions, inorganic phosphates, inorganic pyrophosphates, and the like. Several of these byproduct detection methods typically involve template-dependent nucleotide incorporation.
In some embodiments, the modified polymerases of the present invention can be used to perform label-free nucleic acid sequencing, and in particular ion-based nucleic acid sequencing. The concept of label-free nucleic acid sequencing (including ion-based nucleic acid sequencing) is described in the following references, which are incorporated herein by reference in their entirety: Rothberg et al., U.S. Patent Publication Nos. 2009/0026082, 2009/0127589, 2010/0301398, 2010/0300895, 2010/0300559, 2010/0197507, and 2010/0137143. Briefly, in such nucleic acid sequencing applications, nucleotide incorporation is determined by detecting the presence of natural byproducts of polymerase-catalyzed nucleic acid synthesis reactions, including hydrogen ions, polyphosphates, PPi, and Pi (e.g., in the presence of pyrophosphatase).
In typical embodiments of ion-based nucleic acid sequencing, nucleotide incorporation is detected by detecting the presence and/or concentration of hydrogen ions generated by a polymerase-catalyzed nucleic acid synthesis reaction (including, for example, a primer extension reaction). In one embodiment, a template operably bound to a primer and a polymerase and located within a reaction chamber (such as a microwell as disclosed in Rothberg et al., cited above) is subjected to repeated cycles of polymerase-catalyzed nucleotide addition to the primer ("addition step") followed by washing ("washing step"). In some embodiments, such templates may be attached to a solid support, such as a microparticle, nucleic acid sequencing bead, or the like, in the form of a clonal population, and the clonal population loaded into a reaction chamber. As used herein, "operably bound" means that the primer is bound to the template such that the primer can be extended by the polymerase, and that the polymerase is bound to or in close proximity to such primer-template duplex such that primer extension occurs whenever sufficient nucleotides are supplied.
In each addition step of the cycle, the polymerase extends the primer by incorporating the added nucleotide in a template-dependent manner, such that the nucleotide is incorporated only when the next base in the template is the complement of the added nucleotide. If there is one complementary base, there is one incorporation; if there are two complementary bases, there are two incorporations; if there are three complementary bases, there are three incorporations; and so on. For each such incorporation, hydrogen ions are released, and collectively a population of templates releasing hydrogen ions alters the local pH of the reaction chamber. In some embodiments, the production of hydrogen ions is directly proportional (e.g., monotonically correlated) to the number of consecutive complementary bases in the template (and to the total number of template molecules with primers and polymerase participating in the extension reaction). Thus, when there are many consecutive identical complementary bases (i.e., homopolymer regions) in the template, the number of hydrogen ions generated, and thus the magnitude of the local pH change, is directly proportional to the number of consecutive identical complementary bases. If the next base in the template is not complementary to the added nucleotide, then no incorporation occurs and no hydrogen ions are released.
In some embodiments, after each step of adding nucleotides, a washing step is performed in which unbuffered wash solutions at a predetermined pH are used to remove the nucleotides of the preceding step to prevent misincorporation (incomplete extension) in subsequent cycles. In some embodiments, after each step of adding nucleotides, an additional step may be performed in which the reaction chamber is treated with a nucleotide destroying agent (such as apyrase) to eliminate any residual nucleotides remaining in the chamber, thereby minimizing the chance of false extension in subsequent cycles. In some embodiments, the treatment may be included as part of the washing step itself.
In one exemplary embodiment, different species (or "types") of nucleotides are added to the reaction chamber in sequence such that each reactant is exposed to a different nucleotide type one at a time. For example, the nucleotide types may be added in the following order: dATP, dCTP, dGTP, dTTP, etc.; wherein each exposure is followed by a washing step. The cycle may be repeated 50 times, 100 times, 200 times, 300 times, 400 times, 500 times, 750 times or more depending on the length of sequence information required. In some embodiments, the time it takes to apply the various nucleotides sequentially into the reaction chamber (i.e., flow cycling) may vary depending on the sequencing information desired. For example, when sequencing long nucleic acid molecules, flow cycling may be reduced in some cases to reduce the total time required to sequence the entire nucleic acid molecule. In some embodiments, flow cycling may be increased, for example, when sequencing short nucleic acids or amplicons. In some embodiments, the flow cycle may be about 0.5 seconds to about 3 seconds. In some embodiments, the flow cycle may be about 1 second to about 1.5 seconds.
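The addition steps and nucleotide order described above can be illustrated with a short simulation. The following is a minimal sketch, assuming an idealized reaction in which every incorporation is complete and error-free. It counts how many nucleotides would be incorporated in each flow, which under the proportionality described above tracks the relative amount of hydrogen ion released. The function name, the default flow order string, and the toy template are hypothetical.

```python
# Watson-Crick complement of each template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flows(template, flow_order="ACGT", n_flows=8):
    """Return the number of incorporations in each nucleotide flow.

    template   -- template bases in the order the polymerase reads them,
                  starting immediately downstream of the primer
    flow_order -- nucleotide types added one at a time (dATP, dCTP, dGTP, dTTP)
    n_flows    -- total number of single-nucleotide flows to simulate
    """
    position = 0
    counts = []
    for flow_index in range(n_flows):
        nucleotide = flow_order[flow_index % len(flow_order)]
        count = 0
        # Incorporate as long as the next template base pairs with this
        # nucleotide; a homopolymer run gives several incorporations at once.
        while (position < len(template)
               and COMPLEMENT[template[position]] == nucleotide):
            count += 1
            position += 1
        counts.append(count)
    return counts


# Toy template "TTGCA": the polymerase needs A, A, C, G, T in that order.
print(simulate_flows("TTGCA", n_flows=4))  # [2, 1, 1, 1]
```

A flow whose nucleotide is not complementary to the next template base gives a count of zero, corresponding to no incorporation and no hydrogen ion release.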
In one embodiment, the invention generally relates to a method of detecting nucleotide incorporation, comprising: using a modified polymerase to perform nucleotide incorporation and produce one or more byproducts of the nucleotide incorporation; and detecting the presence of at least one of the one or more byproducts of the nucleotide incorporation, thereby detecting the nucleotide incorporation.
In some embodiments, the method may further comprise repeating the performing and detecting steps at least once. In some embodiments, the modified polymerase exhibits increased read length and/or processivity relative to a reference polymerase under otherwise similar or identical reaction conditions.
In some embodiments, detecting the presence of a sequencing byproduct comprises contacting the reaction mixture with a sensor capable of sensing the presence of the sequencing byproduct. The sensor may comprise a field effect transistor, such as a chemFET or an ISFET. In some embodiments, the nucleotide-incorporated sequencing byproducts can include hydrogen ions, dye-linking moieties, polyphosphates, pyrophosphate, or phosphate moieties, and detecting the presence of the sequencing byproducts comprises detecting the sequencing byproducts using an ISFET. In some embodiments, the detecting step comprises detecting hydrogen ions using an ISFET.
In some embodiments, the modified polymerase includes a polymerase linked to a bridging moiety. The bridging moiety is optionally linked to the polymerase via one or more attachment sites within the modified polymerase. In some embodiments, the bridging moiety is connected to the polymerase via a linking moiety. The linking moiety may be linked to at least one of the one or more attachment sites of the polymerase. In some embodiments, the polymerase of the modified polymerase includes a single attachment site, and the bridging moiety is connected to the polymerase directly via the single attachment site or via the linking moiety. In some embodiments, a single attachment site may be linked to a biotin moiety, and the bridging moiety may include an avidin moiety. In some embodiments, the bridging moiety is linked to the polymerase via at least one biotin-avidin linkage. In some embodiments, the modified polymerase exhibits increased read length, increased processivity, increased read accuracy, increased total throughput, reduced strand bias, and/or reduced systematic error relative to a reference polymerase under otherwise similar or identical reaction conditions.
In some embodiments, the present invention relates generally to a method of detecting a change in ion concentration during a nucleotide polymerization reaction, comprising: performing a nucleotide polymerization reaction using a modified polymerase comprising a polymerase linked to a bridging moiety, wherein the concentration of at least one type of ion changes during the time course of the nucleotide polymerization reaction; and detecting a signal indicative of a change in concentration of at least one type of ion.
In some embodiments, the method may further comprise repeating the performing and detecting steps at least once. In some embodiments, detecting a change in concentration of at least one type of ion comprises using a sensor capable of sensing the presence of a byproduct. The sensor may comprise a field effect transistor, such as a chemFET or an ISFET. In some embodiments, the at least one type of ion comprises a hydrogen ion, a polyphosphate, a pyrophosphate, or a phosphate moiety, and detecting a change in concentration of at least one type of ion comprises detecting the at least one type of ion using an ISFET. In some embodiments, the at least one type of ion comprises a hydrogen ion, and detecting the presence of the at least one type of ion comprises detecting the hydrogen ion using an ISFET.
In some embodiments, the present invention generally relates to methods (and related kits, systems, devices, and compositions) for performing a nucleotide polymerization reaction comprising or consisting of contacting a modified polymerase or biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, wherein the modified polymerase or biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy, coverage, and/or processivity as compared to a reference polymerase; and polymerizing at least one of the one or more nucleotides using the modified polymerase or biologically active fragment thereof.
In some embodiments, the present invention generally relates to methods (and related kits, systems, devices, and compositions) for performing a nucleotide polymerization reaction comprising or consisting of contacting a modified polymerase or a biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, wherein the modified polymerase or the biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase, and wherein the modified polymerase or the biologically active fragment thereof has increased thermostability relative to the reference polymerase; and polymerizing at least one of the one or more nucleotides using the modified polymerase or biologically active fragment thereof. In some embodiments, the method comprises polymerizing at least one of the one or more nucleotides using the modified polymerase or biologically active fragment thereof in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution may comprise a solution in excess of 100mM KCl. In some embodiments, the high ionic strength solution comprises a solution of at least 120mM KCl. In some embodiments, the high ionic strength solution comprises a solution of 125mM to 200mM KCl.
In some embodiments, the method can further comprise polymerizing one of the at least one nucleotide in a template-dependent manner. In some embodiments, the polymerization is conducted under thermal cycling conditions. In some embodiments, the method may further comprise hybridizing a primer to the nucleic acid template before, during, or after the contacting, and wherein the polymerizing comprises polymerizing one of the at least one nucleotide onto the end of the primer using the modified polymerase or biologically active fragment thereof. In some embodiments, the polymerizing is performed in the vicinity of a sensor capable of detecting the polymerization of at least one nucleotide by the modified polymerase or biologically active fragment thereof. In some embodiments, the method may further comprise using a sensor to detect a signal indicative of polymerization of the at least one nucleotide by the modified polymerase or biologically active fragment thereof. In some embodiments, the sensor is an ISFET. In some embodiments, the sensor may comprise a detectable label or detectable reagent within the polymerization reaction.
In some embodiments, the present invention generally relates to methods (and related kits, devices, systems, and compositions) for performing nucleic acid amplification comprising or consisting of generating an amplification reaction mixture having a modified polymerase (typically having 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:34) or a biologically active fragment thereof, a primer, a nucleic acid template, and one or more nucleotides, wherein the modified polymerase or the biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase and has improved thermostability relative to the reference polymerase; and subjecting the amplification reaction mixture to amplification conditions, wherein at least one of the one or more nucleotides is polymerized onto the end of the primer using the modified polymerase or biologically active fragment thereof. In some embodiments, a modified polymerase or biologically active fragment thereof having improved thermostability relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of a sequence having at least 80% identity to: SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, or SEQ ID NO:33.
In some embodiments, the present invention generally relates to methods (and related kits, devices, systems, and compositions) for performing nucleic acid amplification comprising or consisting of generating an amplification reaction mixture having a modified polymerase (typically having 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:34) or a biologically active fragment thereof, a primer, a nucleic acid template, and one or more nucleotides, wherein the modified polymerase or the biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase and has improved accuracy relative to the reference polymerase; and subjecting the amplification reaction mixture to amplification conditions, wherein at least one of the one or more nucleotides is polymerized onto the end of the primer using the modified polymerase or biologically active fragment thereof. In some embodiments, a modified polymerase or biologically active fragment thereof having improved accuracy relative to a reference polymerase (e.g., SEQ ID NO:1 or SEQ ID NO:34) comprises or consists of a sequence having at least 80% identity to: SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, or SEQ ID NO:33. In an illustrative embodiment, the method is an emulsion PCR method. Thus, it will be appreciated that the amplification reaction mixture is added to the emulsion oil composition, followed by exposure of the nucleic acid template to amplification conditions. The addition may be performed over a period of several seconds rather than all at once and may occur while the emulsion oil composition is being stirred.
The solution comprising the emulsion oil and the amplification reaction mixture may then be stirred, for example for 30 seconds to 30 minutes, 1 minute to 20 minutes, 2 minutes to 10 minutes or for example 5 minutes, followed by exposure to amplification conditions. Amplification may occur after dispensing the reaction mixture into a PCR compatible location (which may then be loaded onto a thermal cycler) after stirring. In certain embodiments, emulsion PCR is performed in a reaction mixture comprising 120 to 200mM salt, such as 120mM to 150mM KCl.
In some embodiments, the method further comprises determining the identity of one or more nucleotides polymerized by the modified polymerase. In some embodiments, the method further comprises determining the number of nucleotides polymerized by the modified polymerase. In some embodiments, at least 50% of the one or more nucleotides polymerized by the modified polymerase are identified. In some embodiments, substantially all of the one or more nucleotides polymerized by the modified polymerase are identified. In some embodiments, the polymerization occurs in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution comprises 125mM to 200mM salt. In some embodiments, the polymerization occurs in the presence of an ionic strength solution of at least 120mM salt. In some embodiments, the high ionic strength solution comprises KCl and/or NaCl.
In some embodiments, the invention generally relates to methods (and related kits, systems, devices, and compositions) for performing nucleotide polymerization reactions, the methods comprising or consisting of mixing a modified polymerase or biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, wherein the modified polymerase or biologically active fragment thereof comprises one or more amino acid modifications relative to a reference polymerase (such as SEQ ID NO:1 or SEQ ID NO:34); and polymerizing at least one of the one or more nucleotides using the modified polymerase or biologically active fragment thereof in the mixture. A high ionic strength solution refers to a reaction mixture for conducting nucleotide polymerization having at least 120mM KCl. In some embodiments, the high ionic strength solution comprises a solution of 125mM to 200mM KCl.
In some embodiments, the methods (and related kits, devices, systems, and compositions) comprise a modified polymerase or biologically active fragment thereof comprising or consisting of at least 80% identity to: SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 20, SEQ ID NO 21, SEQ ID NO 22, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 29, SEQ ID NO 30, SEQ ID NO 31, SEQ ID NO 32 or SEQ ID NO 33.
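By way of illustration only, a percent identity threshold such as the 80% figure above can be checked with a short calculation. The following is a minimal sketch that assumes the two sequences have already been aligned to equal length with "-" marking gaps, and that identity is counted only over columns in which both sequences have a residue. The specification does not prescribe an alignment algorithm or a particular identity definition, so the function names and the choice of denominator are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where both sequences have a residue.

    Assumption: inputs are pre-aligned, equal-length strings using '-' for gaps.
    Other identity definitions (e.g. counting gap columns) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """Return True when percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Placeholder 832-residue sequence differing at one position (toy data,
    # not SEQ ID NO: 1): 831 of 832 residues match.
    reference = "A" * 832
    variant = "A" * 762 + "F" + "A" * 69
    print(round(percent_identity(reference, variant), 2))
    print(meets_identity_threshold(reference, variant, 95.0))
```

Under this definition, a full-length 832-residue polymerase carrying a single substitution retains 831/832, or roughly 99.9%, identity to its reference.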
In some embodiments, the present invention relates generally to methods (and related kits, systems, devices, and compositions) for detecting nucleotide incorporation, the methods comprising or consisting of performing a nucleotide incorporation reaction using a modified polymerase or a biologically active fragment thereof having at least 90% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; generating the nucleotide incorporation; and detecting the nucleotide incorporation. Detection of nucleotide incorporation can be performed via any suitable means, such as PAGE, fluorescence, dPCR quantification, nucleotide byproduct generation (e.g., hydrogen ion or pyrophosphate detection; suitable nucleotide byproduct detection systems include, but are not limited to, next generation (i.e., massively parallel, high throughput) sequencing platforms such as RainDance, Roche 454, and Ion Torrent Systems), or nucleotide extension product detection (e.g., optical detection of extension products or detection of labeled nucleotide extension products). In some embodiments, the methods for detecting nucleotide incorporation (and related kits, systems, devices, and compositions) include or consist of detecting nucleotide incorporation using a modified polymerase or biologically active fragment thereof that includes at least 95% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method of detecting nucleotide incorporation comprises or consists of detecting nucleotide incorporation using a modified polymerase or biologically active fragment thereof comprising at least 98% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method of detecting nucleotide incorporation comprises or consists of detecting nucleotide incorporation by a modified polymerase or biologically active fragment thereof comprising at least 99% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method further comprises determining the identity of one or more nucleotides in the nucleotide incorporation. In some embodiments, the byproduct of nucleotide incorporation is a hydrogen ion. In some embodiments, the byproduct of nucleotide incorporation is pyrophosphate. In some embodiments, the byproduct of nucleotide incorporation is a labeled nucleotide extension product. In some embodiments, the method of detecting nucleotide incorporation comprises generating nucleotide incorporation under emulsion PCR or bridge PCR conditions.
In some embodiments, the present invention relates generally to methods (and related kits, systems, devices, and compositions) for detecting changes in ion concentration during a nucleotide polymerization reaction comprising or consisting of performing a first nucleotide polymerization reaction on a nucleic acid template or nucleic acid library in the presence of one or more nucleotides to be incorporated during the first nucleotide polymerization reaction, wherein the first nucleotide polymerization reaction comprises a modified polymerase or biologically active fragment thereof having at least 80% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and performing a second nucleotide polymerization reaction, wherein the second nucleotide polymerization reaction detects changes in the concentration of at least one type of ion during a second nucleotide polymerization reaction time course and provides a signal indicative of changes in the concentration of at least one type of ion. In some embodiments, the ions are hydrogen ions. In some embodiments, the ion is a pyrophosphate ion. In some embodiments, the signal indicative of the change in ion concentration is a relative increase in hydrogen ion production in the polymerization reaction. In some embodiments, the detection of at least one type of ion concentration change is monitored using an ISFET.
In some embodiments, the modified polymerase or biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 150 consecutive amino acid residues of a polymerase having at least 90% identity to the sequence: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the modified polymerase or biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 200 contiguous amino acid residues of a polymerase having at least 95% identity to the sequence: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the modified polymerase or biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 250 contiguous amino acid residues of a polymerase having at least 98% identity to the sequence: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the modified polymerase or biologically active fragment from the first nucleotide polymerization reaction comprises or consists of a polymerase having at least 99% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the present invention relates generally to methods (and related kits, systems, devices, and compositions) for amplifying nucleic acids comprising or consisting of contacting a nucleic acid with a polymerase or biologically active fragment thereof comprising at least 80% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the amplification is performed using polymerase chain reaction, emulsion polymerase chain reaction, isothermal amplification reaction, recombinase polymerase amplification reaction, proximity ligation amplification, rolling circle amplification, or strand displacement amplification. In some embodiments, amplifying comprises clonally amplifying the nucleic acid in solution. In some embodiments, the amplifying comprises clonally amplifying the nucleic acids on a solid support, such as a nucleic acid bead, a flow cell, a nucleic acid array, or a well present on a surface of the solid support. In some embodiments, amplification is performed using a polymerase or biologically active fragment comprising a thermostable DNA polymerase. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase with improved thermostability as compared to a reference polymerase, such as SEQ ID NO:1 or SEQ ID NO: 34. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase with improved accuracy compared to a reference polymerase, such as SEQ ID NO:1 or SEQ ID NO: 34.
In some embodiments, the methods for amplifying nucleic acids (and related kits, systems, devices, and compositions) comprise contacting a nucleic acid with a polymerase or biologically active fragment thereof comprising at least 90% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase having an improved average read length compared to the average read length obtained using a DNA polymerase encoded by the following sequence under the same amplification conditions: SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, a method for amplifying a nucleic acid comprises contacting the nucleic acid with a polymerase or biologically active fragment thereof comprising at least 95% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the methods include polymerases or biologically active fragments having an improved average read length compared to an average read length obtained using a DNA polymerase encoded by: SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, a method for amplifying a nucleic acid comprises contacting the nucleic acid with a polymerase or biologically active fragment thereof comprising at least 98% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the methods include polymerases or biologically active fragments having an improved average read length compared to an average read length obtained using a DNA polymerase encoded by: SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, a method for amplifying a nucleic acid comprises contacting the nucleic acid with a polymerase or biologically active fragment thereof comprising at least 99% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, the methods include polymerases or biologically active fragments having an improved average read length compared to an average read length obtained using a DNA polymerase encoded by: SEQ ID NO: 1 or SEQ ID NO: 34.
In some embodiments, the average read length is determined by analyzing, across all reads, the read lengths of amplified nucleic acid obtained using one or more of the modified polymerases provided herein to establish an average read length, and comparing that average read length to an average read length obtained using a reference polymerase.
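The comparison of average read lengths described above can be sketched as follows. This is an illustrative outline only: the function names and the example read lengths are hypothetical, and the specification does not state which read-quality filter (for example AQ20) or which software is applied before averaging.

```python
from statistics import mean


def average_read_length(read_lengths):
    """Arithmetic mean read length across all reads in a run."""
    if not read_lengths:
        raise ValueError("at least one read is required")
    return mean(read_lengths)


def compare_to_reference(modified_lengths, reference_lengths):
    """Compare mean read length for a modified polymerase against a reference.

    Both inputs are assumed to come from runs performed under the same
    amplification conditions, as the description requires.
    """
    modified_mean = average_read_length(modified_lengths)
    reference_mean = average_read_length(reference_lengths)
    return {
        "modified_mean": modified_mean,
        "reference_mean": reference_mean,
        "difference": modified_mean - reference_mean,
        "improved": modified_mean > reference_mean,
    }


if __name__ == "__main__":
    # Hypothetical read lengths in bases; not data from the specification.
    result = compare_to_reference([210, 190, 200], [150, 160, 170])
    print(result)
```

A simple mean is used here because the description refers to an "average read length"; a median or a quality-filtered mean could be substituted without changing the structure of the comparison.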
In some embodiments, the present invention relates generally to a method for amplifying a nucleic acid, comprising or consisting of contacting a nucleic acid with a polymerase or biologically active fragment thereof comprising at least 80% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and amplifying the nucleic acid. In some embodiments, amplification is performed by a polymerase or biologically active fragment with improved templating efficiency compared to a reference sample, such as SEQ ID NO:1 or SEQ ID NO: 34. In some embodiments, the method for amplifying nucleic acid comprises amplifying nucleic acid under emulsion PCR conditions. In some embodiments, the method for amplifying a nucleic acid comprises amplifying a nucleic acid under bridge PCR conditions. In some embodiments, the bridge PCR conditions comprise hybridizing one or more of the amplified nucleic acids to a solid support. In some embodiments, the hybridized one or more amplified nucleic acids can be used as a template for further amplification. In some embodiments, the modified polymerase or biologically active fragment thereof comprises a polymerase derived from Thermus aquaticus (Taq) DNA polymerase. SEQ ID NO: 1 is the full-length wild-type amino acid sequence of Thermus aquaticus (Taq) DNA polymerase. In some embodiments, Taq DNA polymerase may be used as a reference polymerase in the methods, kits, devices, systems, and compositions described herein.
In some embodiments, the present invention relates generally to methods (and related kits, systems, devices, and compositions) for synthesizing nucleic acids comprising or consisting of incorporating at least one nucleotide onto an end of a primer using a modified polymerase or biologically active fragment thereof having at least 90% identity to: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. Optionally, the method further comprises detecting incorporation of at least one nucleotide onto the end of the primer. In some embodiments, the method further comprises determining the identity of at least one of the at least one nucleotide incorporated onto the end of the primer. In some embodiments, the method can include determining the identity of all nucleotides incorporated onto the end of the primer. In some embodiments, the method comprises synthesizing the nucleic acid in a template-dependent manner. In some embodiments, the method may comprise synthesizing the nucleic acid in solution, on a solid support, or in an emulsion (such as emPCR).
In some embodiments, provided herein are kits comprising at least two containers (e.g., test tubes) each containing one or more reaction mixtures or reaction mixture components as provided herein, such as nucleotide triphosphates and/or buffers effective for nucleic acid polymerization, amplification, and/or sequencing reactions. At least one of the containers comprises a modified polymerase of the invention or a biologically active fragment thereof, and one or more other tubes may comprise, for example, nucleotides and/or buffers suitable for one of the methods provided herein. In some embodiments, the kit may be a virtual kit in which multiple individual reagents are listed and/or sold together, such as on a web page or smartphone application that lists different reagents that may be purchased together.
In some embodiments, a kit suitable for conducting a nucleic acid polymerization reaction comprises a buffer, at least one type of nucleotide triphosphate, and a modified polymerase or biologically active fragment thereof of the invention.
In some embodiments, a kit suitable for conducting a nucleic acid polymerization reaction comprises a buffer, at least one type of nucleotide triphosphate, optionally a salt (such as sodium chloride or potassium chloride), and optionally MgCl2, and a second tube comprising the modified polymerase of the invention or a biologically active fragment thereof. The tube containing the polymerase may include a stabilizer and other components, such as glycerol and detergents such as, for example, Tween-20 or NP-40. In some illustrative embodiments, a kit may comprise a third container containing a solid support, such as a bead, optionally containing primers for performing the methods provided herein. In certain embodiments, the kit may further comprise a tube comprising an oil suitable for forming an emulsion for emulsion PCR and optionally an emulsion stabilizer. By way of example, and not intended to be limiting, such a tube may include biocompatible mineral oil, Allox 4912, and Span 80.
In some embodiments, one test tube of the kit comprises a buffer composition comprising any one or more of: monovalent metal salts, divalent anions, and detergents. For example, the buffer composition may include a potassium or sodium salt. For example, the buffer composition may include 50 to 200mM salt, 50 to 100mM salt, or 120 to 200mM salt, such as 120 to 150mM KCl. In some embodiments, the buffer composition may include a manganese or magnesium salt. In some embodiments, the buffer composition may include a sulfate salt, such as potassium sulfate and/or magnesium sulfate. In some embodiments, the buffer composition may include a detergent, such as Triton and/or Tween.
In some embodiments, the kit comprises a test tube having a buffer composition comprising at least one potassium salt, at least one manganese salt, and Triton X-100 (Pierce Biochemicals). The salt may optionally include a chloride salt or a sulfate salt. In some embodiments, the buffer composition may comprise a pH of about 7.3 to about 8.0. In some embodiments, the buffer composition can include a pH of about 7.4 to about 7.9. In some embodiments, the buffer composition comprises a potassium salt (divalent dependent) at a concentration of between 5-250mM, 50-225mM, or 125-200mM.
In some embodiments, the buffer composition included in the kit tubes includes a magnesium or manganese salt at a concentration between 1mM and 20 mM. In some embodiments, the buffer composition comprises a magnesium or manganese salt at a concentration between 6-15 mM. In some embodiments, the buffer composition comprises sulfate at a concentration between 1mM and 100 mM. In some embodiments, the buffer composition comprises sulfate at a concentration between 5-50 mM. In some embodiments, the buffer composition of the kit comprises a detergent (e.g., Triton X-100 or Tween-20) at a concentration of between 0.001% and 1%. In some embodiments, the buffer composition includes a detergent (e.g., Triton X-100 or Tween-20) at a concentration of between 0.0025% and 0.0125%.
In some embodiments, the kit further comprises a nucleic acid capture bead. These kits may include test tubes with an emulsion oil (e.g., mineral oil).
The following non-limiting examples are provided by way of illustration of exemplary embodiments only and in no way limit the scope and spirit of the present invention. Moreover, it is to be understood that any invention disclosed or claimed herein encompasses all variations, combinations, and permutations of any one or more of the features described herein. Any one or more features may be explicitly excluded from the claims even if a particular exclusion is not explicitly set forth herein. It is also to be understood that the disclosure of reagents for use in the methods is intended to be synonymous with (and provide support for) the methods involving the use of the reagents, in accordance with the particular methods disclosed herein or other methods known in the art, unless a person of ordinary skill in the art would otherwise understand. In addition, where the specification and/or claims disclose a method, any one or more of the reagents disclosed herein may be used in the method, unless the skilled artisan would otherwise understand it.
Examples of the invention
Example 1: production and purification of exemplary modified polymerases
Amino acid mutations were introduced via site-directed mutagenesis into an exemplary reference polymerase having the amino acid sequence of SEQ ID NO: 1. In this example, wild-type full-length Taq DNA polymerase (832 amino acids in length) was used as a reference polymerase from which exemplary mutations were introduced via site-directed saturation mutagenesis.
Here, each of the 831 amino acid residues in SEQ ID NO:1 following the methionine at amino acid position 1 of SEQ ID NO: 1 was replaced by every possible amino acid. Recombinant expression constructs encoding these modified polymerases were transformed into bacteria. Colonies containing the expression construct were inoculated into BRM medium, grown to an OD of 0.600, and induced by addition of IPTG to a final concentration of 1 mM. The cells were then grown for an additional 3 hours at 37 ℃.
The induced cells were centrifuged at 6000rpm for 10 minutes, the supernatant was discarded, and the cells were resuspended in resuspension buffer (10mM Tris, pH 7.5, 100mM NaCl). The resuspended cells were sonicated for one minute at a setting (amplitude) of 60 and then placed on ice for 1 minute. Sonication was repeated a total of 5 times in this manner. The samples were incubated at 65 ℃ for 10 minutes. The samples were centrifuged at 9000rpm for 30 minutes. The supernatant was recovered and further purified on a heparin column.
The expression and/or polymerase activity of the purified polymerases was assessed in comparison to SEQ ID NO: 1. The number of amino acid residues substituted along the entire length of WT Taq DNA polymerase was 831. The average number of amino acid variants observed at each amino acid residue along the polymerase was 17.8 variants per amino acid residue. The total number of polymerase clones (each consisting of a single amino acid substitution compared to SEQ ID NO: 1) achieved using this method was 14,833. The number of clones with characteristics far superior to those observed for polymerase performance under standard emulsion PCR conditions using SEQ ID NO: 1 was 332. These superior features or characteristics include thermal stability, and/or polymerase activity in 125mM KCl or NaCl, and/or at least one of the following features or characteristics associated with the sequencing reaction when a mutant polymerase is used in the template emulsion PCR amplification step of the nucleic acid being analyzed in the sequencing reaction: read length, accuracy, strand bias, systematic error and total sequencing throughput. The number of clones assessed for secondary features was 31, each of which consisted of a single amino acid substitution compared to SEQ ID NO: 1. As provided herein, exemplary modified polymerases SEQ ID NO:5 through SEQ ID NO:33 are modified polymerases consisting of a single amino acid substitution compared to SEQ ID NO: 1.
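The library figures reported above can be cross-checked with simple arithmetic. The sketch below assumes that "every possible amino acid" means the 19 standard amino acids other than the wild-type residue at each position; that assumption, and the variable and function names, are not taken from the specification.

```python
SUBSTITUTED_POSITIONS = 831      # residues 2 to 832 of SEQ ID NO: 1
ALTERNATIVES_PER_POSITION = 19   # assumption: 20 standard amino acids minus wild type
OBSERVED_CLONES = 14_833         # single-substitution clones reported
SUPERIOR_CLONES = 332            # clones reported as superior to SEQ ID NO: 1


def library_statistics():
    """Derive summary figures for the saturation mutagenesis library."""
    theoretical = SUBSTITUTED_POSITIONS * ALTERNATIVES_PER_POSITION
    return {
        "theoretical_variants": theoretical,
        "mean_variants_per_position": OBSERVED_CLONES / SUBSTITUTED_POSITIONS,
        "library_coverage": OBSERVED_CLONES / theoretical,
        "superior_fraction": SUPERIOR_CLONES / OBSERVED_CLONES,
    }


if __name__ == "__main__":
    for name, value in library_statistics().items():
        print(f"{name}: {value:.4g}")
```

Under the stated assumption the theoretical library size is 831 × 19 = 15,789 variants, and 14,833 observed clones over 831 positions gives a mean of about 17.8 variants per position, consistent with the reported average.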
Example 2: production and purification of exemplary double, triple and quadruple modified polymerases
Double, triple and quadruple amino acid substitutions were introduced via site-directed mutagenesis into an exemplary reference polymerase having the amino acid sequence of SEQ ID NO: 1. In this example, wild-type full-length Taq DNA polymerase (SEQ ID NO:1) was used as a reference polymerase, from which double, triple and quadruple amino acid mutations were introduced.
Here, a modified polymerase was prepared according to example 1. Several of the 31 clones with assessed secondary characteristics served as the basis for combining multiple single amino acid substitutions into the reference polymerase according to the method set forth in example 1.
Briefly, the relevant clones from example 1 were rated as polymerases performing better than WT Taq DNA polymerase under the same emPCR conditions. Selected individual amino acid substitutions were then introduced into a reference polymerase (SEQ ID NO:1) via site-directed mutagenesis to generate a variety of different double, triple and quadruple amino acid substitution polymerases. Recombinant expression constructs encoding these modified polymerases were transformed into bacteria. Colonies containing the expression construct were inoculated into BRM medium, grown to an OD of 0.600, and induced by addition of IPTG to a final concentration of 1 mM. The cells were then grown for an additional 3 hours at 37 ℃.
The induced cells were centrifuged at 6000rpm for 10 minutes, the supernatant was discarded, and the cells were resuspended in resuspension buffer (10mM Tris, pH 7.5, 100mM NaCl). The resuspended cells were sonicated for one minute at a setting (amplitude) of 60 and then placed on ice for 1 minute. Sonication was repeated a total of 5 times in this manner. The samples were incubated at 65 ℃ for 10 minutes. The samples were centrifuged at 9000rpm for 30 minutes. The supernatant was recovered and further purified on a heparin column.
The expression and/or polymerase activity of the purified double, triple and quadruple amino acid substitution polymerases was assessed in comparison to SEQ ID NO: 1. As provided herein, SEQ ID NO:3 and SEQ ID NO:4 represent exemplary double and triple amino acid substitution polymerases, respectively, that have PCR performance superior to WT Taq DNA polymerase (SEQ ID NO:1) under emulsion PCR conditions. These superior features or characteristics include thermal stability, and/or polymerase activity in 125mM KCl or NaCl, and/or at least one of the following features or characteristics associated with the sequencing reaction when a mutant polymerase is used in the template emulsion PCR amplification step of the nucleic acid being analyzed in the sequencing reaction: read length, accuracy, strand bias, systematic error and total sequencing throughput. These improvements are believed to be at least partially a result of the increased capacity for sustained synthesis.
SEQ ID NO:3 consists of a double amino acid substitution polymerase (L763F + E805I) with numbering relative to WT Taq DNA polymerase (SEQ ID NO: 1).
SEQ ID NO:4 consists of a triple amino acid substitution polymerase (E397V + E745T + L763F), with numbering relative to WT Taq DNA polymerase (SEQ ID NO: 1).
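The substitution notation used above (for example L763F + E805I, numbered relative to SEQ ID NO: 1) can be applied to a reference sequence programmatically. The following is a minimal sketch with hypothetical function names; it uses a short toy sequence for the demonstration because the sequence listing is not reproduced in this section, and it checks that the stated wild-type residue is actually present at each position before substituting.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'L763F' into ('L', 763, 'F')."""
    match = _SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def parse_combination(text):
    """Split a combination such as 'L763F + E805I' into individual labels."""
    return [part.strip() for part in text.split("+")]


def apply_substitutions(reference, labels):
    """Return the reference sequence with each substitution applied.

    Positions are 1-based and refer to the reference numbering. A ValueError
    is raised if the reference does not carry the stated wild-type residue.
    """
    residues = list(reference)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside reference sequence")
        if reference[position - 1] != wild_type:
            raise ValueError(
                f"{label}: reference has {reference[position - 1]} "
                f"at position {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    toy_reference = "MLEKL"  # toy sequence, not SEQ ID NO: 1
    print(apply_substitutions(toy_reference, parse_combination("L2F + E3I")))
```

With the full 832-residue reference sequence in place of the toy string, the same call with the combination L763F + E805I would be expected to produce a double-substitution sequence of the kind described for SEQ ID NO: 3, provided the reference carries L at position 763 and E at position 805.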
Example 3: comparing the performance of a modified polymerase and a reference polymerase in emulsion PCR
A modified isolated polymerase comprising a mutant Taq DNA polymerase (SEQ ID NO:2) comprising the amino acid substitution E397V (wherein numbering is relative to the amino acid sequence of SEQ ID NO:1) was purified essentially as described in example 1. The performance of both the modified polymerase (SEQ ID NO:2) and the reference polymerase (SEQ ID NO:1) (control reaction) in an emulsion-based PCR reaction under the same conditions was then evaluated.
The pool of nucleic acid molecules to be amplified under emulsion PCR conditions comprised amplicons known to be particularly difficult to amplify under standard emPCR conditions. The pool of nucleic acid molecules included 55 amplicons with a high or very high GC content (> 60% GC); 42 amplicons with a high or very high AT content (> 60% AT); 299 amplicons comprising homopolymer (HP) regions of varying lengths (e.g., 2HP-9HP); 95 amplicons prematurely attenuated under standard emPCR conditions; 20 amplicons with an insert length of 320 bp; and 20 amplicons with an insert length of 420 bp.
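The composition thresholds used to describe the pool (> 60% GC or > 60% AT) can be illustrated with a short classifier. This is a sketch only: the function names, the category labels, and the example sequences are hypothetical, and the tally at the end simply sums the category counts stated above without assuming anything about whether the categories overlap.

```python
def base_fraction(sequence, bases):
    """Fraction of positions in the sequence occupied by any of the given bases."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for base in seq if base in bases) / len(seq)


def composition_class(sequence, threshold=0.60):
    """Label an amplicon as high_GC, high_AT, or balanced using a 60% cutoff."""
    if base_fraction(sequence, "GC") > threshold:
        return "high_GC"
    if base_fraction(sequence, "AT") > threshold:
        return "high_AT"
    return "balanced"


# Category counts as listed in the description of the amplicon pool.
POOL_COUNTS = {
    "high_GC": 55,
    "high_AT": 42,
    "homopolymer": 299,
    "prematurely_attenuated": 95,
    "insert_320bp": 20,
    "insert_420bp": 20,
}

if __name__ == "__main__":
    for seq in ("GGCCGGCCAT", "ATATATATGC", "ACGTACGT"):  # toy sequences
        print(seq, composition_class(seq))
    print("sum of category counts:", sum(POOL_COUNTS.values()))
```

The category counts sum to 531; this equals the number of distinct amplicons only if no amplicon belongs to more than one category, which the description does not state.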
Briefly, adaptor ligation and size selection of nucleic acid molecule libraries are performed as described in the ion fragment library set user guide (ion torrent system, part number 4466464; published part number 4467320Rev B). Then according to Ion XpressTMThe protocol provided in the template set v2.0 user guide (Ion torrent system, part number 4469004a) (incorporated herein by reference in its entirety) and the reagents provided in the Ion template preparation set (Ion torrent system/life technologies, part number 4466461), Ion template reagent set (Ion torrent system/life technologies, part number 4466462) and Ion template solution set (Ion torrent system/life technologies, part number 4466463) were used to expand libraries of nucleic acid molecules to Ion Sphere spheresTMOn the particle (ion torrent system, part number 602-1075-01), except that the polymerase to be tested or the reference polymerase is used instead of the polymerase provided in the kit. Followed by loading of the amplified nucleic acid molecule into PGMTM314 sequencing chip. Loading of chips into ion torrent PGMTMSequencing was performed in a sequencing system (ion torrent system/life technologies, part number 4462917) and substantially according to the protocol provided in the ion sequencing kit v2.0 user guide (ion torrent system/life technologies, part number 4469714Rev a) and using reagents provided in the ion sequencing kit v2.0 (ion torrent system/life technologies, part number 4466456) and the ion chip kit (ion torrent system/life technologies, part number 4462923). The ion torrent system is a subsidiary of life technologies (carlsbad, california).
Sequencing data obtained using a reference polymerase or a modified polymerase was analyzed to measure AQ20 Mean Read Length (MRL), strand bias, base coverage, accuracy, sequencing throughput (Mb), and uniformity of coverage.
Sequencing reaction data for emPCR using the reference polymerase or the modified polymerase were measured and compared using the standard software supplied with the PGM™ sequencing system. An exemplary modified polymerase comprising amino acid substitution E397V (SEQ ID NO:2) provided significantly increased AQ20 MRL, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb), and increased uniformity of coverage relative to the reference polymerase (SEQ ID NO:1) (data not shown).
The amino acid sequence corresponding to the modified polymerase provided in this example is SEQ ID NO 2. It will be immediately apparent to one of ordinary skill in the art that any one or more of the modified polymerases disclosed or suggested herein can be readily converted (e.g., reverse translated) to the corresponding nucleic acid sequence encoding the modified polymerase. It will also be apparent to the skilled artisan that the nucleic acid sequence of each polypeptide may vary due to the degenerate nature of the codons. For example, six codons (CTT, CTC, CTA, CTG, TTA, and TTG) exist for leucine. Thus, the base at position 1 of this codon may be C or T, the base at position 2 of this codon is always T, and the base at position 3 may be T, C, A or G. Thus, any modified polymerase disclosed or suggested herein can be translated into any one or more of the degenerate codon nucleic acid sequences.
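The codon degeneracy described above can be illustrated with a short enumeration. This is a sketch only: the three-entry codon table and the peptide "MLF" are illustrative assumptions chosen to keep the example small (the codons themselves are from the standard genetic code), and `back_translations` is a hypothetical helper name.

```python
from itertools import product

# The six leucine codons listed in the text.
LEUCINE_CODONS = ["CTT", "CTC", "CTA", "CTG", "TTA", "TTG"]

# Deliberately minimal codon table (standard genetic code); a real
# back-translation would need all twenty amino acids.
CODONS = {
    "M": ["ATG"],
    "L": LEUCINE_CODONS,
    "F": ["TTT", "TTC"],
}


def back_translations(peptide):
    """Return every nucleic acid sequence encoding the peptide under CODONS."""
    return ["".join(codons) for codons in product(*(CODONS[aa] for aa in peptide))]


# Bases observed at each codon position for leucine.
position_bases = [sorted({codon[i] for codon in LEUCINE_CODONS}) for i in range(3)]
print(position_bases)

sequences = back_translations("MLF")
print(len(sequences))
```

The per-position base sets match the description in the text, but they should not be combined freely: TTT and TTC, for example, encode phenylalanine rather than leucine. Enumerating the explicit codon list, as `back_translations` does, avoids generating such sequences.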
Example 4: evaluation of the Performance of modified polymerases in high ionic Strength emulsion PCR
A variety of modified polymerases consisting of single amino acid substitutions were prepared according to example 1. The performance of the modified DNA polymerase in emulsion PCR was then evaluated to generate a nucleic acid library essentially according to example 3. The pool of nucleic acid molecules to be amplified under emulsion PCR conditions comprises amplicons known to be particularly difficult to amplify under standard emPCR conditions (see example 3).
In this example, a salt titration experiment was performed to determine the functionality of the modified polymerase under high ionic strength conditions. Salt titration included evaluation at 75mM salt, 100mM salt and high ionic strength solution (125mM salt). In this example, the high ionic strength condition comprises 125mM KCl.
Briefly, adaptor ligation and size selection were performed on the library of nucleic acid molecules as described in the Ion Fragment Library Kit User Guide (Ion Torrent Systems, part number 4466464; publication part number 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, part number 602-1075-01) according to the protocol provided in the Ion Xpress™ Template Kit v2.0 User Guide (Ion Torrent Systems, part number 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, part number 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, part number 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, part number 4466463), except that the polymerase to be tested or the reference polymerase was used instead of the polymerase provided in the kit.
The amplified nucleic acid molecules were then loaded into a PGM™ 314 sequencing chip. The chip was loaded into an Ion Torrent PGM™ sequencing system (Ion Torrent Systems/Life Technologies, part number 4462917), and sequencing was performed substantially according to the protocol provided in the Ion Sequencing Kit v2.0 User Guide (Ion Torrent Systems/Life Technologies, part number 4469714 Rev A) using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, part number 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, part number 4462923). Ion Torrent Systems is a subsidiary of Life Technologies (Carlsbad, California).
Fig. 1A-1E depict exemplary results of several modified polymerases consisting of a single amino acid substitution prepared according to example 1 and evaluated under emPCR conditions at various salt concentrations. The first bar (read from left to right) at each salt concentration represents the sequencing throughput obtained for each modified polymerase. The second bar (read from left to right) at each salt concentration represents the average read length (MRL) obtained for each modified polymerase. The last bar (read from left to right) at each salt concentration represents the key signal (which is the control showing whether emPCR occurred). In general, for each of the five modified polymerases presented, the sequencing throughput levels were increased in reaction conditions including 125mM KCl compared to 75mM KCl during the emPCR reaction.
Example 5: evaluation of Performance of Dual amino acid substitution mutant polymerases in emulsion PCR
The performance in an emulsion PCR (emPCR) reaction of a modified polymerase comprising a double amino acid substitution (E397V + E745T, wherein numbering is relative to the amino acid residues of SEQ ID NO:1) of Taq DNA polymerase, prepared according to example 2, was compared to that of a Taq DNA polymerase with a single amino acid substitution (SEQ ID NO:34) in generating a nucleic acid library.
Nucleic acid pools were generated from Rhodopseudomonas palustris, which has a 5,459,213 base pair circular chromosome with a GC content of 65.05% (see Larimer et al., Nature Biotechnology, 2004, Vol. 22, No. 1, pp. 55-61), and were evaluated under high ionic strength conditions (125mM salt; here, 125mM KCl).
The libraries obtained from the emPCR step using the modified polymerases were then applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, part number 4462917).
Briefly, adaptor ligation and size selection were performed on the Rhodopseudomonas palustris library (here, the insert was a 420bp insert) as described in the Ion Fragment Library Kit User Guide (Ion Torrent Systems, part number 4466464; publication part number 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, part number 602-1075-01) according to the protocol provided in the Ion Xpress™ Template Kit v2.0 User Guide (Ion Torrent Systems, part number 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, part number 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, part number 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, part number 4466463), except that the polymerase to be tested or the reference polymerase was used instead of the polymerase provided in the kit.
The amplified library was then loaded into a PGM™ 314 sequencing chip. The chip was loaded into an Ion Torrent PGM™ sequencing system (Ion Torrent Systems/Life Technologies, part number 4462917), and sequencing was performed substantially according to the protocol provided in the Ion Sequencing Kit v2.0 User Guide (Ion Torrent Systems/Life Technologies, part number 4469714 Rev A) using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, part number 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, part number 4462923). Ion Torrent Systems is a subsidiary of Life Technologies (Carlsbad, California).
The resulting sequencing data obtained during high ionic strength emPCR using the modified polymerase was analyzed to measure the number of AQ20 bases, the accuracy, and the average perfect read length of the 420bp inserts. Data for exemplary sequencing runs performed as outlined in this example are shown in FIGS. 2A1-2B2. As can be seen, the sequencing data obtained under 125mM KCl emPCR conditions with the double amino acid substitution polymerase (E397V + E745T) had improved AQ20 read length (386bp versus 359bp), improved read accuracy (99.8% versus 99.6%), and improved perfect read length (331bp versus 290bp) compared to the single amino acid substitution polymerase (SEQ ID NO:34). Both modified DNA polymerases were capable of generating large amounts of sequencing data owing to the large nucleic acid pools generated under high ionic strength conditions during the emPCR process. However, the double amino acid substitution polymerase outperformed the single amino acid substitution polymerase (SEQ ID NO:34) under the same conditions.
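The size of the reported differences can be put on a common scale with a short calculation. The sketch below uses only the figures quoted above; the helper name `percent_gain` and the choice to express accuracy as an error rate (100% minus accuracy) are illustrative assumptions rather than metrics defined in this example.

```python
def percent_gain(new_value, reference_value):
    """Relative improvement of new_value over reference_value, in percent."""
    return 100.0 * (new_value - reference_value) / reference_value


# (double substitution polymerase, SEQ ID NO:34 polymerase), as reported above.
aq20_read_length_bp = (386, 359)
perfect_read_length_bp = (331, 290)
read_accuracy_percent = (99.8, 99.6)

aq20_gain = percent_gain(*aq20_read_length_bp)
perfect_gain = percent_gain(*perfect_read_length_bp)

# Assumption: treat (100 - accuracy) as a per-base error rate.
error_ratio = (100.0 - read_accuracy_percent[0]) / (100.0 - read_accuracy_percent[1])

print(f"AQ20 read length gain: {aq20_gain:.1f}%")
print(f"Perfect read length gain: {perfect_gain:.1f}%")
print(f"Error rate relative to SEQ ID NO:34: {error_ratio:.2f}x")
```

Under these assumptions the AQ20 read length gain is about 7.5%, the perfect read length gain is about 14.1%, and the error rate is roughly half that of the SEQ ID NO:34 polymerase.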
Example 6: performance of amino acid substituted Taq polymerase mutants in emulsion PCR
The performance of multiple modified polymerases consisting of different single amino acid substitutions prepared according to example 1 in an emulsion PCR reaction was compared to a Taq DNA polymerase mutant (SEQ ID NO:34) also consisting of a single amino acid substitution to generate a nucleic acid library. Nucleic acid pools were generated from Rhodopseudomonas palustris and evaluated under various ionic strength conditions (e.g., 75mM to 150mM KCl).
The libraries obtained from emPCR using the modified polymerases were then applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, part number 4462917).
Briefly, adaptor ligation and size selection were performed on the Rhodopseudomonas palustris library (here, the insert was a 420bp insert) as described in the Ion Fragment Library Kit User Guide (Ion Torrent Systems, part number 4466464; publication part number 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, part number 602-1075-01) according to the protocol provided in the Ion Xpress™ Template Kit v2.0 User Guide (Ion Torrent Systems, part number 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, part number 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, part number 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, part number 4466463), except that the polymerase to be tested or the reference polymerase was used instead of the polymerase provided in the kit.
The amplified library was then loaded into a PGM™ 314 sequencing chip. The chip was loaded into an Ion Torrent PGM™ sequencing system (Ion Torrent Systems/Life Technologies, part number 4462917), and sequencing was performed substantially according to the protocol provided in the Ion Sequencing Kit v2.0 User Guide (Ion Torrent Systems/Life Technologies, part number 4469714 Rev A) using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, part number 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, part number 4462923). Ion Torrent Systems is a subsidiary of Life Technologies (Carlsbad, California).
The resulting exemplary sequencing data obtained using the modified polymerases during emPCR was analyzed to measure AQ20 total base counts, AQ17 mean, AQ20 mean, coverage uniformity, strand bias, total base coverage, Systematic Sequencing Error (SSE), and other benchmarks. Data for sequencing runs performed as outlined in this example are shown in FIGS. 3A-3B. As can be seen, the sequencing data obtained from several of the single amino acid substitution polymerases had improved AQ20 total base counts, improved AQ20 read lengths, reduced strand bias, and reduced SSE compared to the reference single amino acid substitution polymerase (SEQ ID NO:34). For example, the single amino acid substitution polymerases E397V, E794C and R593G each outperformed the reference single amino acid substitution polymerase (SEQ ID NO:34) under the same conditions.
Example 7: evaluation of mutant polymerases for thermal stability (GC coverage)
The performance in an emulsion PCR reaction of a modified polymerase comprising a single amino acid substitution (E397V) of Taq DNA polymerase, having the amino acid sequence of SEQ ID NO:2, was compared to that of a Taq DNA polymerase with a different single amino acid substitution (SEQ ID NO:34) in generating a nucleic acid library. The nucleic acid pool was generated from Rhodopseudomonas palustris, which has a GC content of more than 65%. Each mutant polymerase was evaluated for its ability to function in high ionic strength solution (e.g., 125mM KCl) during emPCR.
The library obtained from the emPCR step using the modified Taq DNA polymerase was then applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, part number 4462917).
Briefly, adaptor ligation and size selection were performed on the Rhodopseudomonas palustris library (here, the insert was a 420bp insert) as described in the Ion Fragment Library Kit User Guide (Ion Torrent Systems, part number 4466464; publication part number 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, part number 602-1075-01) according to the protocol provided in the Ion Xpress™ Template Kit v2.0 User Guide (Ion Torrent Systems, part number 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, part number 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, part number 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, part number 4466463), except that the polymerase to be tested or the reference polymerase was used instead of the polymerase provided in the kit.
The amplified library was then loaded into a PGM™ 314 sequencing chip. The chip was loaded into an Ion Torrent PGM™ sequencing system (Ion Torrent Systems/Life Technologies, part number 4462917), and sequencing was performed substantially according to the protocol provided in the Ion Sequencing Kit v2.0 User Guide (Ion Torrent Systems/Life Technologies, part number 4469714 Rev A) using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, part number 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, part number 4462923). Ion Torrent Systems is a subsidiary of Life Technologies (Carlsbad, California).
The resulting exemplary sequencing data obtained using the modified Taq DNA polymerase as outlined in this example is shown in FIG. 4. In FIG. 4, "Taqlr1" refers to SEQ ID NO:34 (D732R), while "Hit # 1" refers to the amino acid substitution "E397V" (SEQ ID NO:2). Taqlr1, or Taq-LR1, polymerase was engineered to have higher template affinity than wild-type Taq polymerase. This increased template affinity has allowed templating of longer libraries, presumably due to improved fidelity and sustained synthesis capability. As is evident from the data, the single amino acid substitution polymerase "E397V" outperformed Taq-LR1 (SEQ ID NO:34) in read length, sequencing throughput, and coverage uniformity under the same emPCR conditions.
In addition, the nucleic acid pool amplified during emPCR was a pool from a bacterial species with at least 65% GC content. It is well known in the art that amplification of nucleic acid molecules with high GC content is generally more difficult than amplification of targets that are not GC-rich (McDowell et al., Nucleic Acids Research 1998; 26: 3340-). It is also well known in the art that GC content predicts melting temperature. Thus, a high GC content genome will have a higher melting temperature than a lower GC content genome. Here, the "E397V" modified polymerase provided significantly greater amplification of the high GC content nucleic acid library than SEQ ID NO:34. Based on the global sequencing data in this example, it was determined that the modified polymerase (SEQ ID NO:2) produced fewer than 5 gaps in genome coverage per gigabyte of sequencing data, whereas SEQ ID NO:34 produced at least 99 gaps in genome coverage per gigabyte of sequencing data.
The ability to accurately and faithfully sequence GC-rich organisms and GC-rich regions is useful in a variety of fields, including bacterial research, where several bacterial species have GC contents greater than 65% (e.g., some Streptomyces and Mycobacterium species). Polymerases that produce 99 or more gaps in genome coverage per gigabyte of sequencing data are less suitable for DNA sequencing, detection methods, and the like than polymerases that produce fewer than 20, 10, or 5 gaps in genome coverage per gigabyte of sequencing data. To complete the genome using the former polymerase, the user would need to perform additional amplification of the genome using "trim reactions", which include designing and purchasing a pair of primers for each gap. Once the primers are successfully designed, 99 or more primer pair reactions per gigabyte of sequencing data must each yield sufficient amplification to cover the gaps that exist across the entire genome. With the latter polymerase (e.g., SEQ ID NO:2), however, the user can assemble most of the Rhodopseudomonas palustris genome in a single emPCR reaction, and needs to prepare only 5 primer pairs per gigabyte of sequencing data to complete the genome when necessary.
Without being limited by theory, a modified polymerase with improved GC content coverage as defined herein may also correspond to a modified polymerase with improved thermostability (see example 10). During emPCR, the nucleic acid pool must be denatured in order to perform the amplification step. If the GC content of the nucleic acid library is high, the library is less likely to denature and any primer present is less likely to bind correctly to the template strand; the polymerase is therefore less likely to initiate polymerization from the primer. The modified polymerase "E397V" was found to have a higher thermostability at 96 ℃ than SEQ ID NO:1 (see example 10). The modified polymerase "E397V" also exhibited higher coverage uniformity and longer read length at 97 ℃ compared to the same reaction at 96 ℃ (see FIG. 4). Therefore, the modified polymerase "E397V" had greater thermostability compared to SEQ ID NO:1 under the same emPCR conditions.
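The relationship between GC content and melting temperature noted above can be sketched numerically. The following is a rough illustration only: it uses the empirical Marmur-Doty approximation for long duplex DNA under standard saline citrate conditions, which is an assumption introduced here and is not taken from this example, and the 50% GC comparison genome is hypothetical. Buffer composition, salt concentration and additives all shift the actual values.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(1 for base in sequence if base in "GC") / len(sequence)


def marmur_doty_tm(gc_percent):
    """Approximate melting temperature in degrees C (Marmur-Doty relation).

    Assumption: long duplex DNA in standard saline citrate; illustrative only.
    """
    return 69.3 + 0.41 * gc_percent


# GC content reported for Rhodopseudomonas palustris in example 5.
tm_high_gc = marmur_doty_tm(65.05)
# Hypothetical 50% GC genome, for comparison only.
tm_balanced = marmur_doty_tm(50.0)

print(f"65.05% GC: approx. {tm_high_gc:.1f} C")
print(f"50.00% GC: approx. {tm_balanced:.1f} C")
print(f"GC fraction of GGCCAT: {gc_fraction('GGCCAT'):.2f}")
```

Under this approximation, each additional percentage point of GC content raises the estimated melting temperature by 0.41 ℃, so a 65% GC genome melts about 6 ℃ higher than a 50% GC genome under the same conditions.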
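The finishing effort implied by the gap rates above scales linearly with the amount of sequencing data. A minimal sketch, assuming one primer pair per gap as described in the text; the function name `finishing_primer_pairs` and the 2-gigabyte data set are hypothetical.

```python
import math


def finishing_primer_pairs(gaps_per_gigabyte, gigabytes_of_data):
    """Primer pairs needed to close coverage gaps, assuming one pair per gap."""
    return math.ceil(gaps_per_gigabyte * gigabytes_of_data)


# Gap rates quoted in the text; the data set size is an illustrative assumption.
data_gigabytes = 2
pairs_seq_id_34 = finishing_primer_pairs(99, data_gigabytes)
pairs_seq_id_2 = finishing_primer_pairs(5, data_gigabytes)

print(pairs_seq_id_34, pairs_seq_id_2)
```

Because the text gives these rates as bounds (at least 99 gaps and fewer than 5 gaps per gigabyte), the two results are a lower bound for SEQ ID NO:34 and an upper bound for SEQ ID NO:2.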
Example 8: evaluation of polymerase Performance in emulsion PCR for double amino acid substitution mutants
In this example, four polymerases each with a double amino acid substitution (E397V + E745T; P6N + E295F; P6N + E397V or E745T + E794C) were prepared according to example 2 and their performance in an emPCR reaction under high ionic strength conditions was compared to Taq DNA polymerase with a single amino acid substitution (SEQ ID NO: 34).
E. coli 500bp inserts were used to prepare the nucleic acid libraries. The nucleic acid libraries obtained from the emPCR reactions were then applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, part number 4462917).
Briefly, template DNA was purified, adaptor ligated and size selected as described in the Ion Fragment Library Kit User Guide (Ion Torrent Systems, part number 4466464; publication part number 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, part number 602-1075-01) according to the protocol provided in the Ion Xpress™ Template Kit v2.0 User Guide (Ion Torrent Systems, part number 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, part number 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, part number 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, part number 4466463), except that the polymerase to be tested or the reference polymerase was used instead of the polymerase provided in the kit.
The amplified library was then loaded into a PGM™ 318 sequencing chip. The chip was loaded into an Ion Torrent PGM™ sequencing system (Ion Torrent Systems/Life Technologies, part number 4462917), and sequencing was performed substantially according to the protocol provided in the Ion Sequencing Kit v2.0 User Guide (Ion Torrent Systems/Life Technologies, part number 4469714 Rev A) using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, part number 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, part number 4462923). Ion Torrent Systems is a subsidiary of Life Technologies (Carlsbad, California).
The resulting sequencing data obtained using the modified polymerase under high ionic strength emPCR (125mM KCl) was analyzed to measure AQ20 base number, raw read accuracy and true accuracy of Homopolymers (HP) of 5bp, 6bp or 7bp in length. Data for exemplary sequencing operations performed as outlined in this example are shown in figures 5A-5B. As can be seen, the sequencing data from all four polymerases consisting of a double amino acid substitution outperformed the single amino acid substitution polymerase (SEQ ID NO:34) in all observed metrics.
Example 9: evaluation of the Performance of double and triple amino acid substitution polymerase mutants in emulsion PCR
In this example, three polymerases each having one or more amino acid substitutions were prepared according to example 1 or example 2 and their performance in an emPCR reaction under high ionic strength conditions (e.g., 125mM KCl) was compared to Taq DNA polymerase (SEQ ID NO:34) having a single amino acid substitution.
E. coli 500bp inserts were used to prepare the nucleic acid libraries. The nucleic acid libraries obtained from the emPCR reactions were then applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, part number 4462917).
Briefly, template DNA was purified, adaptor ligated and size selected as described in the Ion Fragment Library Kit User Guide (Ion Torrent Systems, part number 4466464; publication part number 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, part number 602-1075-01) according to the protocol provided in the Ion Xpress™ Template Kit v2.0 User Guide (Ion Torrent Systems, part number 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, part number 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, part number 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, part number 4466463), except that the polymerase to be tested or the reference polymerase was used instead of the polymerase provided in the kit.
The amplified library was then loaded into a PGM™ 318 sequencing chip. The chip was loaded into an Ion Torrent PGM™ sequencing system (Ion Torrent Systems/Life Technologies, part number 4462917), and sequencing was performed substantially according to the protocol provided in the Ion Sequencing Kit v2.0 User Guide (Ion Torrent Systems/Life Technologies, part number 4469714 Rev A) using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, part number 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, part number 4462923). Ion Torrent Systems is a subsidiary of Life Technologies (Carlsbad, California).
The resulting sequencing data obtained using the modified polymerases under high ionic strength emPCR was analyzed to measure AQ20 base number, raw accuracy, Systematic Sequencing Error (SSE) and total sequencing throughput (AQ20 bases), among other metrics. Data for an exemplary sequencing run performed as outlined in this example is shown in FIG. 6. As can be seen, based on all the observed metrics, the sequencing data from mutant polymerase "E397V" outperformed the single amino acid substitution polymerase (SEQ ID NO:34). The "E397V" modified polymerase also outperformed both the double and triple amino acid substitution polymerases under the same emPCR conditions.
Example 10: comparison of the thermal stability of the modified polymerase with a reference polymerase
In this example, various modified polymerases containing one or more amino acid substitutions were prepared according to example 1 or example 2. A modified polymerase "E397V" consisting of a single amino acid substitution at amino acid residue 397, which is the numbering relative to the amino acid residue of SEQ ID No. 1, was prepared according to example 1. A modified polymerase "SEQ ID NO: 34" consisting of a single amino acid substitution compared to SEQ ID NO:1 was prepared according to example 1.
The modified polymerase "E794C + E805I" consists of double amino acid substitutions relative to the amino acid numbering of SEQ ID NO:1 and was prepared according to example 2. In addition, the modified polymerase "E397V + E745T" consists of double amino acid substitutions numbered relative to SEQ ID NO:1 amino acids and was prepared according to example 2.
The polymerases described above were each prepared as a PCR strip for thermostability testing for thermal cycling at 95 ℃ as follows: 15mM Tris pH 7.5, 100mM KCl, 30% trehalose, 0.1% NP40 (detergent) and 50nM polymerase (see FIG. 14).
The PCR strips were incubated at various time points of heat treatment (no heat control 0 min; 2 min; 4 min; 6 min or 8 min). After completion of the heat treatment at 95 ℃ or 96 ℃, the reaction mixture was placed on ice. The reaction mixture is then transferred to a culture plate for polymerase activity analysis.
Here, the polymerase activity assay was prepared by combining 15mM Tris pH 7.5, 100mM KCl, 8mM MgCl2, 150nM oligo 221 and 5nM polymerase from the heat treatment reaction mixture (10 μl). Oligo 221 is a hairpin oligonucleotide with a fluorescent dye attached (TTTTTTTGCAGGTGACAGGTTTTCCTGTCACCCXGC (SEQ ID NO:50), where X is a fluorescein-dT residue). Upon addition of dATP, oligo 221 is extended, causing a release of fluorescence (see Nikiforov, Analytical Biochemistry (2011) 229-236, incorporated herein by reference in its entirety).
To initiate polymerase activity assays, 20 μ M dATP was added to each reaction. The change in fluorescence of each heat-treated polymerase was measured at each time point (0, 2, 4, 6, and 8 minutes) and plotted (see fig. 7-10). Here, the fluorescence signal at 525nm was measured using an excitation wavelength of 490nm over a certain period of time.
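One way to summarize such fluorescence measurements is as residual activity relative to the unheated control. The sketch below is an assumption about the data reduction, not the analysis used to produce the figures: it takes the least-squares slope of each fluorescence trace as the polymerase activity and assumes the signal rises as oligo 221 is extended. All trace values and read times are hypothetical illustrative numbers, not measured data.

```python
def slope(times, signal):
    """Least-squares slope of signal versus time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(signal) / n
    numerator = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signal))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


# Hypothetical read times (seconds after dATP addition).
READ_TIMES_S = [0, 10, 20, 30]

# Hypothetical 525 nm fluorescence traces, keyed by minutes of heat treatment.
TRACES = {
    0: [100, 200, 300, 400],  # no-heat control
    2: [100, 190, 280, 370],
    4: [100, 180, 260, 340],
    6: [100, 160, 220, 280],
    8: [100, 140, 180, 220],
}


def residual_activity(traces, read_times):
    """Percent activity remaining at each heat time, relative to the 0 min control."""
    rates = {minutes: slope(read_times, trace) for minutes, trace in traces.items()}
    unheated_rate = rates[0]
    return {minutes: 100.0 * rate / unheated_rate for minutes, rate in rates.items()}


activity = residual_activity(TRACES, READ_TIMES_S)
for minutes in sorted(activity):
    print(f"{minutes} min heat treatment: {activity[minutes]:.0f}% residual activity")
```

Normalizing to the no-heat control makes polymerases with different starting activities comparable, so a more thermostable polymerase appears as a curve that decays more slowly with heat treatment time.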
The thermostability of a variety of other single or dual amino acid substitution polymerases prepared according to example 1 or example 2 was also assessed as outlined in the examples.
FIG. 11 provides exemplary thermostability data obtained for a plurality of single or double amino acid mutant polymerases compared to SEQ ID NO:34 (TAQ LR1) at 95 ℃ in the presence of trehalose. The mutant polymerase "E397V" and the mutant polymerase "G418C" exhibited maximum thermostability at 95 ℃ under the test conditions. It should be noted that the amino acid residue numbers in FIGS. 11-14 represent residues that are mutated as follows: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. Double mutants, which carry the above mutations at the indicated residues, are indicated by two numbers separated by a slash.
FIG. 12 provides exemplary thermostability data obtained for the same single or double amino acid mutant polymerases as compared to WT Taq DNA polymerase (SEQ ID NO:1) (TAQ WT) and SEQ ID NO:34 (TAQ LR1) at 96 ℃ in the presence of trehalose. Mutant polymerase "E397V" and mutant polymerase "G418C" exhibited maximum thermostability at 96 ℃ under the test conditions.
Figure 13 provides exemplary thermal stability data obtained for the same single or dual amino acid mutant polymerases, as compared to WT Taq DNA polymerase (Taq WT) at 95 ℃ in the absence of trehalose (during the heat treatment step). Here, the mutant polymerase "E397V" and the mutant polymerase "G418C" exhibited thermal stability at 95 ℃ in the absence of trehalose superior to WT Taq (SEQ ID NO:1) under the test conditions.
It will be apparent to one of ordinary skill in the art that the foregoing thermal stability analysis is provided as an exemplary thermal stability analysis and is not meant to be limiting or restrictive in any way. Thus, other variations of the thermostability assays provided herein, other forms of thermostability assays, or other means of assessing residual polymerase activity are encompassed within the scope of the invention.
Sequence listing
<110> Life technologies Co
<120> polymerase compositions and methods of making and using the same
<130> LT00925PCT
<160> 44
<170> PatentIn version 3.5
<210> 1
<211> 832
<212> PRT
<213> Thermus aquaticus
<220>
<221> misc_feature
<223> wild type Taq polymerase
<400> 1
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 2
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: Taq mutant E397V (Hit # 1)
<400> 2
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Val Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 3
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: (L763F + E805I)
<400> 3
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Phe Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Ile Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 4
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: (E397V + E745T + L763F)
<400> 4
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Val Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Thr Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Phe Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 5
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: P6N
<400> 5
Met Arg Gly Met Leu Asn Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 6
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: A77E
<400> 6
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Glu Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 7
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: A97V
<400> 7
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Val Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 8
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: L193V
<400> 8
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Val Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 9
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: K240I
<400> 9
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Ile
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 10
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: R266Q
<400> 10
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Gln Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 11
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: E267T
<400> 11
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Thr Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 12
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: L287T
<400> 12
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Thr Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 13
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: P291T
<400> 13
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Thr Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 14
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: K292C
<400> 14
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Cys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 15
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: E295F
<400> 15
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Phe Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 16
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: E295N
<400> 16
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Asn Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 17
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: G418C
<400> 17
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Cys Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 18
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: L490Q
<400> 18
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Gln Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 19
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: A502S
<400> 19
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ser Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 20
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: S543V
<400> 20
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Val Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 21
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: D578E
<400> 21
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Glu Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 22
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: R593G
<400> 22
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Gly Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 23
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: L678F
<400> 23
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Phe Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 24
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: L678T
<400> 24
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Thr Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 25
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: S699W
<400> 25
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Trp Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 26
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: E713W
<400> 26
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Trp Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 27
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: V737A
<400> 27
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Ala Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 28
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: E745T
<400> 28
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Thr Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 29
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: L763F
<400> 29
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Phe Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 30
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: E790G
<400> 30
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Gly Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 31
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: E794C
<400> 31
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Cys Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 32
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthesis: E805I
<400> 32
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Ile Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 33
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: L828A
<400> 33
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Ala Ser Ala Lys Glu
820 825 830
<210> 34
<211> 832
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic: (reference polymerase) - (Taq-LR 1) D732R
<400> 34
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp
305 310 315 320
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro
325 330 335
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
370 375 380
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu
385 390 395 400
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu
405 410 415
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu
420 425 430
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly
435 440 445
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
450 455 460
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Arg Leu Glu Ala Arg
725 730 735
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830
<210> 35
<211> 6
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<220>
<221> misc_feature
<222> (2)..(2)
<223> Xaa can be any naturally occurring amino acid
<220>
<221> misc_feature
<222> (4)..(5)
<223> Xaa can be any naturally occurring amino acid
<400> 35
Asp Xaa Ser Xaa Xaa Glu
1 5
<210> 36
<211> 9
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<220>
<221> misc_feature
<222> (2)..(7)
<223> Xaa can be any naturally occurring amino acid
<400> 36
Lys Xaa Xaa Xaa Xaa Xaa Xaa Tyr Gly
1 5
<210> 37
<211> 4
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<400> 37
Val His Asp Glu
1
<210> 38
<211> 8
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<220>
<221> misc_feature
<222> (2)..(3)
<223> Xaa can be any naturally occurring amino acid
<400> 38
Asp Xaa Xaa Ser Leu Tyr Pro Ser
1 5
<210> 39
<211> 9
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<220>
<221> misc_feature
<222> (2)..(4)
<223> Xaa can be any naturally occurring amino acid
<220>
<221> misc_feature
<222> (7)..(7)
<223> Xaa can be any naturally occurring amino acid
<400> 39
Lys Xaa Xaa Xaa Asn Ser Xaa Tyr Gly
1 5
<210> 40
<211> 6
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<400> 40
Tyr Gly Asp Thr Asp Ser
1 5
<210> 41
<211> 6
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<220>
<221> misc_feature
<222> (2)..(5)
<223> Xaa can be any naturally occurring amino acid
<220>
<221> misc_feature
<222> (6)..(6)
<223> Xaa is Phe or Tyr
<400> 41
Asp Xaa Xaa Xaa Xaa Xaa
1 5
<210> 42
<211> 7
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<220>
<221> misc_feature
<222> (2)..(2)
<223> Xaa can be any naturally occurring amino acid
<220>
<221> misc_feature
<222> (4)..(6)
<223> Xaa can be any naturally occurring amino acid
<220>
<221> misc_feature
<222> (7)..(7)
<223> Xaa is Ser or Ala
<400> 42
Phe Xaa Gly Xaa Xaa Xaa Xaa
1 5
<210> 43
<211> 4
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<220>
<221> misc_feature
<222> (2)..(2)
<223> Xaa can be any naturally occurring amino acid
<400> 43
Tyr Xaa Asp Asp
1
<210> 44
<211> 9
<212> PRT
<213> Artificial sequence
<220>
<223> synthetic peptide
<220>
<221> misc_feature
<222> (2)..(8)
<223> Xaa can be any naturally occurring amino acid
<400> 44
Gly Xaa Xaa Xaa Xaa Xaa Xaa Xaa Lys
1 5