BLASTP 2.2.22 [Sep-27-2009]

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Reference for composition-based statistics starting in round 2:
Schaffer, Alejandro A., L. Aravind, Thomas L. Madden, Sergei Shavirin,
John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Query= batch____
         (305 letters)

Database: uniref50.fasta
           3,077,464 sequences; 1,040,396,356 total letters

Searching..................................................done

Results from round 1

                                                                    Score    E
Sequences producing significant alignments:                         (bits) Value

UniRef50_Q46896 Uncharacterized protein ygbT n=119 Tax=cellular ...   617   e-175
UniRef50_Q3ZZ81 CRISPR-associated protein Cas1 n=4 Tax=Bacteria ...   241   3e-62
UniRef50_D1CGD6 CRISPR-associated protein Cas1 n=7 Tax=cellular ...   236   1e-60
UniRef50_D1CAI8 CRISPR-associated protein Cas1 n=3 Tax=Bacteria ...   224   3e-57
UniRef50_D2RB04 CRISPR-associated endonuclease Cas1, ECOLI subty...   214   2e-54
UniRef50_Q2JWC7 CRISPR-associated protein Cas1 n=3 Tax=Chroococc...   211   3e-53
UniRef50_Q21QB1 CRISPR-associated protein Cas1 n=1 Tax=Rhodofera...   206   9e-52
UniRef50_Q03C58 CRISPR-associated protein n=3 Tax=Lactobacillus ...   200   6e-50
UniRef50_D1NTI3 CRISPR-associated protein Cas1 n=10 Tax=Bacteria...   199   7e-50
UniRef50_A8M406 CRISPR-associated protein Cas1 n=5 Tax=Actinomyc...   199   1e-49
UniRef50_C7LYW4 CRISPR-associated protein Cas1 n=1 Tax=Acidimicr...   194   3e-48
UniRef50_C7MTM5 CRISPR-associated protein, Cas1 family n=6 Tax=A...   191   3e-47
UniRef50_D1A5U2 CRISPR-associated protein Cas1 n=3 Tax=Actinomyc...   187   4e-46
UniRef50_B1VIX8 CRISPR-associated protein n=6 Tax=Corynebacteriu...   185   2e-45
UniRef50_C7QEM2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria ...   183   8e-45
UniRef50_B4V4P5 Crispr-associated protein cas1 n=4 Tax=Streptomy...   178   2e-43
UniRef50_C7MTL6 CRISPR-associated protein, Cas1 family n=4 Tax=A...   177   5e-43
UniRef50_C4X9I5 CRISPR-associated Cas1 family protein n=12 Tax=B...   174   2e-42
UniRef50_C2BS02 CRISPR-associated protein n=1 Tax=Mobiluncus cur...   165   1e-39
UniRef50_C9M9R9 CRISPR-associated protein Cas1 n=1 Tax=Jonquetel...   164   4e-39
UniRef50_C1YVP5 CRISPR-associated protein Cas1 n=1 Tax=Nocardiop...   162   9e-39
UniRef50_Q0AA34 CRISPR-associated protein Cas1 n=11 Tax=Bacteria...   146   9e-34
UniRef50_Q47PJ6 CRISPR-associated protein, Cas1 family n=4 Tax=A...   143   6e-33
UniRef50_A3LCN8 Putative uncharacterized protein n=1 Tax=Pseudom...   138   2e-31
UniRef50_B6IWM1 CRISPR-associated protein Cas1, putative n=1 Tax...   138   3e-31
UniRef50_C2KP50 CRISPR-associated Cas1 family protein n=5 Tax=Ac...   109   1e-22
UniRef50_Q3J7J6 CRISPR-associated protein, Cas1 family n=2 Tax=N...    75   3e-12
UniRef50_C6CA70 CRISPR-associated protein Cas1 n=56 Tax=Bacteria...    59   2e-07
UniRef50_D1BQ37 CRISPR-associated protein Cas1 n=1 Tax=Veillonel...    57   1e-06
UniRef50_C8Q0H7 CRISPR-associated protein Cas1 n=6 Tax=Proteobac...    48   5e-04
UniRef50_D2QT50 CRISPR-associated protein Cas1 n=1 Tax=Spirosoma...    47   0.001
UniRef50_C8W2P4 CRISPR-associated protein Cas1 n=1 Tax=Desulfoto...    45   0.002
UniRef50_C1XN81 CRISPR-associated protein Cas1 n=2 Tax=Meiotherm...    45   0.003
UniRef50_B8CYA1 CRISPR-associated protein Cas1 n=2 Tax=cellular ...    44   0.006
UniRef50_C0QHV1 Putative CRISPR-associated protein (Uncharacteri...    44   0.009
UniRef50_Q96X75 Putative uncharacterized protein ST2634 n=1 Tax=...    44   0.010
UniRef50_B8D4S7 CRISPR-associated protein Cas1 n=1 Tax=Desulfuro...    43   0.012
UniRef50_A1ZHZ5 Crispr-associated protein Cas1 n=1 Tax=Microscil...    43   0.014
UniRef50_A3XI90 Putative uncharacterized protein n=1 Tax=Leeuwen...    42   0.032
UniRef50_Q2FL78 CRISPR-associated protein, Cas1 family n=1 Tax=M...    42   0.036

>UniRef50_Q46896 Uncharacterized protein ygbT n=119 Tax=cellular organisms RepID=YGBT_ECOLI
          Length = 305

 Score = 617 bits (1592), Expect = e-175, Method: Compositional matrix adjust.
 Identities = 305/305 (100%), Positives = 305/305 (100%)

Query: 1   MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV 60
           MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV
Sbjct: 1   MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV 60

Query: 61  SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF 120
           SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF
Sbjct: 61  SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF 120

Query: 121 ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180
           ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI
Sbjct: 121 ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180

Query: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP 240
           SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP
Sbjct: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP 240

Query: 241 GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDA 300
           GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDA
Sbjct: 241 GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDA 300

Query: 301 GHRSS 305
           GHRSS
Sbjct: 301 GHRSS 305

>UniRef50_Q3ZZ81 CRISPR-associated protein Cas1 n=4 Tax=Bacteria RepID=Q3ZZ81_DEHSC
          Length = 309

 Score = 241 bits (614), Expect = 3e-62, Method: Compositional matrix adjust.
Identities = 118/280 (42%), Positives = 180/280 (64%), Gaps = 2/280 (0%) Query: 6 LNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 L+ +P +DR S ++L+ G++DV + + + +P+ + +ML PG+ V+HAA Sbjct: 4 LHELPRFRDRWSYLYLEMGRLDV-EADSLGFHQGDTVVPVPIDQLGVVMLGPGSTVTHAA 62 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 ++ +Q L+ W G+ GVR+YA+ G + +L+ QA+L D++ RL+V +M+ RF Sbjct: 63 IKSLSQNNCLIAWTGQDGVRLYAASIGGTYSARRLIRQARLVSDDEKRLEVAWRMYRFRF 122 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 E P S+E +RG+EG RVR YA +++YGV W GR YD KDW KGD IN+ +SAA Sbjct: 123 NEVIPPVVSLESIRGMEGIRVRRAYAKASQEYGVEWKGRHYDQKDWSKGDPINRALSAAN 182 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 +CLYG+ A IL+AGY+ A+GFVHTGK LSFVYD+AD+ K + +P AF++A NP + + Sbjct: 183 ACLYGICHAGILSAGYSSALGFVHTGKMLSFVYDVADLYKTELTIPVAFKVAAANPTDLE 242 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPE 284 R+VR+ CR+ F K L +L+ I +VL + +P E Sbjct: 243 RQVRIECREAFYEFKLLERLLTDIAEVLGVSDDIGESPDE 282 >UniRef50_D1CGD6 CRISPR-associated protein Cas1 n=7 Tax=cellular organisms RepID=D1CGD6_THET1 Length = 324 Score = 236 bits (601), Expect = 1e-60, Method: Compositional matrix adjust. 
Identities = 120/278 (43%), Positives = 170/278 (61%), Gaps = 4/278 (1%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 + D S +++++ +ID A L D TG +T +P S++ +ML PGT ++HAA++ A Sbjct: 12 VSDSWSYLYVEHCRIDQDARAISLHDATG-KTMVPCASLSLLMLGPGTSITHAAIQTLAD 70 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 G L+ WVGE GVR YA G + L QA L D +L L+VVR+M+E+RF P Sbjct: 71 NGCLVAWVGEEGVRFYAQGMGETRSATNTLRQAMLWSDPELHLQVVRRMYEIRFRHPINP 130 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 S++Q+RG+EG+RVR Y L+++ GV W GR Y K W D IN+ ISAA SCLYGV Sbjct: 131 NTSLKQIRGMEGARVRGAYLQLSRETGVEWKGRDYSSKSWHSNDAINRAISAANSCLYGV 190 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 AAI++AGY+ A+GF+HTGK LSFVYD+AD+ K + +P AF + VR Sbjct: 191 CHAAIVSAGYSTALGFIHTGKMLSFVYDVADLYKTEISMPAAFYAVAEGGASLESRVRRK 250 Query: 251 CRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQP 288 CRDI R ++ LA+++ I+ VL + P P + P Sbjct: 251 CRDILRETRLLARIVEDIDTVL---NVDSPIPHKYQNP 285 >UniRef50_D1CAI8 CRISPR-associated protein Cas1 n=3 Tax=Bacteria RepID=D1CAI8_SPHTD Length = 314 Score = 224 bits (570), Expect = 3e-57, Method: Compositional matrix adjust. 
Identities = 117/288 (40%), Positives = 175/288 (60%), Gaps = 6/288 (2%) Query: 4 LPLNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 + L+ +P ++D S +++++ +I+ A + D G+ +P S+ +ML PGT +SH Sbjct: 1 MDLHILPKVRDSWSYLYVEHARIEQEAKAIAIHDAVGM-VPVPCASLGILMLGPGTSISH 59 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 AA+R A+ G L++W GE GVR YA G + L+ QA+L D LRL+VV +M+++ Sbjct: 60 AAIRTLAENGCLVLWTGEEGVRFYAQGLGETRSARNLMRQARLWADPALRLRVVFRMYQM 119 Query: 123 RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISA 182 RF EP P +++Q+RG+EG+RVR YA +++ GV W GR + ++W D IN+ +S Sbjct: 120 RFSEPLPPDLTLQQIRGMEGARVRDAYARASRETGVPWRGRSFQRRNWSATDPINRALSC 179 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGE 242 A SCLYG+ AAI++ GY+P +GF+HTGK LSFVYDIAD+ K +P AF + + Sbjct: 180 ANSCLYGICHAAIVSLGYSPGLGFIHTGKMLSFVYDIADLYKATVTIPLAFRVVAEGTHD 239 Query: 243 PDREVRLACRDIFRSSKTLAKLIPLIEDVL----AAGEIQPPAPPEDA 286 + VR ACRD F + + L + IE VL A G P EDA Sbjct: 240 LEGRVRRACRDAFVAHRLLGTIATDIEHVLDISDADGGADEPDFDEDA 287 >UniRef50_D2RB04 CRISPR-associated endonuclease Cas1, ECOLI subtype n=3 Tax=Bacteria RepID=D2RB04_GARVA Length = 313 Score = 214 bits (546), Expect = 2e-54, Method: Compositional matrix adjust. 
Identities = 105/264 (39%), Positives = 158/264 (59%), Gaps = 3/264 (1%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 + DRVS I++++ +I+ +D A + D G +P + ++L PGT ++H A+ L Sbjct: 18 ISDRVSFIYVEHAKINRLDSAVTVFDANGT-IRVPAAMIGVLLLGPGTEITHRAMELLGD 76 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 VG +VWVGE GVR YA G+ S L Q+KL + RL V RKM+++RF + Sbjct: 77 VGASIVWVGEHGVRNYAHGRALSRSSRLLEKQSKLVTNSRSRLNVARKMYQMRFPNENVS 136 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 +++QLRG EG+RVR Y ++ +Y V WNGR Y D+E G +N+ +S CLYG+ Sbjct: 137 SYTLQQLRGREGARVRHLYREMSNKYNVQWNGRDYKVNDFESGTVVNKALSVGNVCLYGL 196 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDRE--VR 248 + I A G AP +GFVHTG LS VYDIAD+ K + +P +FEIA R + D E +R Sbjct: 197 VHSIISALGLAPGLGFVHTGHDLSLVYDIADLYKAELTIPASFEIAARCESDDDIEQLMR 256 Query: 249 LACRDIFRSSKTLAKLIPLIEDVL 272 L RD F + +++++ I+++L Sbjct: 257 LKMRDCFANCNIMSRIVNDIQNLL 280 >UniRef50_Q2JWC7 CRISPR-associated protein Cas1 n=3 Tax=Chroococcales RepID=Q2JWC7_SYNJA Length = 315 Score = 211 bits (536), Expect = 3e-53, Method: Compositional matrix adjust. 
Identities = 112/299 (37%), Positives = 175/299 (58%), Gaps = 6/299 (2%) Query: 6 LNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 L IP ++D +S ++++ +I+ A ++ + G R IP S+ +ML PGT ++HAA Sbjct: 10 LRSIPKVRDSISFVYVERCRIEQDAKAIAVLQEDG-RYIIPCASLTTLMLGPGTAITHAA 68 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 ++ A + WVGE G+R YASG + ++L +QAKL D ++VVR+M+ RF Sbjct: 69 IKNLADGLCSVQWVGEDGLRFYASGSHPSSSVERLYHQAKLWADPVQHMEVVRRMYSFRF 128 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 EP ++EQ+RG+EG RVR Y+ L+K+ GV W GR Y K+WE D +N+ +S A Sbjct: 129 PEPLKEGLTLEQIRGLEGVRVRTVYSRLSKETGVNWKGRSYKLKEWECADPVNRALSVAN 188 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 +CLY V +AA+ A GY+ A+GF+H GKPLSFVYD+AD+ K + +P AF+ A + Sbjct: 189 TCLYAVCQAALNAVGYSTALGFIHIGKPLSFVYDVADLYKTEITIPVAFKAAAELMPNFE 248 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP----EDAQPVAIPLPVSLGD 299 R CR+ F + + ++I ++ +L Q + P D + A+ VS G+ Sbjct: 249 SRTRQLCREKFVEHRLMQRIIDDVDAILGFRATQEESSPVGSLWDNEKGAVEGGVSYGE 307 >UniRef50_Q21QB1 CRISPR-associated protein Cas1 n=1 Tax=Rhodoferax ferrireducens T118 RepID=Q21QB1_RHOFD Length = 277 Score = 206 bits (524), Expect = 9e-52, Method: Compositional matrix adjust. 
Identities = 105/261 (40%), Positives = 154/261 (59%), Gaps = 9/261 (3%) Query: 12 KDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 K+R+ +FL+ G + V DG +L+ + IP V+C+M+EPG V+H A++L + Sbjct: 18 KNRIPYLFLEKGILRV-DGHCLLLCQAESAIEIPGSMVSCLMIEPGVSVTHEAMKLCGEN 76 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPAR 131 GTLL+WVGE G R YA+ + ++L QA + ++ R+ +++ L F + P Sbjct: 77 GTLLMWVGEGGTRFYAAAH-AHQDASRVLRQAAIHTNQRERIAAASRLYGLMFDDHMPPS 135 Query: 132 RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVT 191 ++E+LRG+EGSRV+ Y LA + G+ W GR E+ +N I ATSCLY + Sbjct: 136 FTIEKLRGLEGSRVKEIYVNLADKLGMVWQGR-------EEKSALNTSIGFATSCLYALC 188 Query: 192 EAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLAC 251 E AILAAGY P IG VH+G P S V+D+AD +KF TVVP AFEIA +P + VR C Sbjct: 189 EVAILAAGYHPGIGVVHSGNPRSLVFDLADTVKFKTVVPLAFEIAATSPSNLNMAVRHGC 248 Query: 252 RDIFRSSKTLAKLIPLIEDVL 272 RD+F L+ +E++ Sbjct: 249 RDLFSRESMFETLLGHLENIF 269 >UniRef50_Q03C58 CRISPR-associated protein n=3 Tax=Lactobacillus RepID=Q03C58_LACC3 Length = 315 Score = 200 bits (508), Expect = 6e-50, Method: Compositional matrix adjust. 
Identities = 105/258 (40%), Positives = 161/258 (62%), Gaps = 5/258 (1%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 +++RV+ ++L++ +I+ D A V+ID TG IP ++ +ML PG V+H A+ L Sbjct: 16 VRERVTFLYLEHAKINRQDSAIVVID-TGGTVAIPAALISVLMLGPGVDVTHRAMELMGD 74 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 G +VWVGE GVR YA G+ S L+ QAKL + LR+ V R+M+++RF + + Sbjct: 75 AGMSVVWVGERGVRQYAPGRALTHSSALLVAQAKLVSNNRLRVGVARQMYQMRFPDDDVS 134 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 S+++LRG EG+RVR Y +++ GV W R YDP++++ G INQ ++AA + LYG+ Sbjct: 135 TLSMQELRGKEGARVRRIYREESRRTGVEWTHREYDPENYQSGSIINQALTAAHAALYGL 194 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD---REV 247 + + I+A G +P +GFVHTG LSFVYD AD+ K + +P AF +A N E D + Sbjct: 195 SYSVIVALGASPGLGFVHTGHDLSFVYDFADLYKAEVTIPIAFTVA-ANATEQDDIGQLT 253 Query: 248 RLACRDIFRSSKTLAKLI 265 RLA RD F K + +++ Sbjct: 254 RLAVRDAFVDGKLMIRMV 271 >UniRef50_D1NTI3 CRISPR-associated protein Cas1 n=10 Tax=Bacteria RepID=D1NTI3_9BIFI Length = 366 Score = 199 bits (507), Expect = 7e-50, Method: Compositional matrix adjust. 
Identities = 102/267 (38%), Positives = 155/267 (58%), Gaps = 1/267 (0%) Query: 12 KDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 +DR++ ++ ++ ++ + A + D G+R HIP +++ +ML PGT V+H A+ + Sbjct: 37 EDRLTFLYFEHCVVNRDNNAITVTDDRGVR-HIPAAALSVLMLGPGTSVTHQAMMVIGDN 95 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPAR 131 G ++WVGE GVR Y SG+P S+ L QA+L + RL V R M+ +RF + Sbjct: 96 GATVIWVGERGVRTYCSGKPLTHSSNLLQKQAQLVTNMRKRLSVARAMYAMRFPHEDVSN 155 Query: 132 RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVT 191 +++QLRG EG+RVR Y +KQ GV W R Y P+D+ D INQ +SAA CLYG+ Sbjct: 156 LTMQQLRGREGARVRRVYRHWSKQTGVRWERRDYRPEDFADSDRINQALSAANICLYGIA 215 Query: 192 EAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLAC 251 A I+A G +P +GFVHTG LSFVYD+AD+ K + +P AF+ A + VR A Sbjct: 216 HAVIVALGCSPGLGFVHTGHELSFVYDMADLYKAELSIPVAFKTAATEVDDIGGAVRRAM 275 Query: 252 RDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 RD + +++ I + A + + Sbjct: 276 RDAMYDLSIMPRMVKDIHHLFDAADAE 302 >UniRef50_A8M406 CRISPR-associated protein Cas1 n=5 Tax=Actinomycetales RepID=A8M406_SALAI Length = 322 Score = 199 bits (505), Expect = 1e-49, Method: Compositional matrix adjust. 
Identities = 114/286 (39%), Positives = 160/286 (55%), Gaps = 5/286 (1%) Query: 12 KDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 +DR+S ++L+ I A D+ GI HIP ++ +ML PGT ++ A+ L A Sbjct: 18 QDRISFVYLERCVIHRDSNAITATDEKGI-VHIPAATLGVLMLGPGTSITQQAMMLIADN 76 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPAR 131 G +VW+GE GVR YA G+P S L+ QA D RL+V R M+ +RF Sbjct: 77 GATVVWIGEHGVRYYAHGRPLARSSRLLVAQAAAVSHRDRRLRVARAMYRMRFPGEDTTN 136 Query: 132 RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVT 191 +++QLRG EG+RVR Y A++ GV+WN R YDP D+ D +NQ +SAA +CLYG+ Sbjct: 137 LTMQQLRGKEGARVRRCYRENAQRTGVSWNSREYDPDDFTGSDPVNQALSAAHACLYGIV 196 Query: 192 EAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLAC 251 A ++A G +P +GFVHTG SFVYDIAD+ K D +P AF+IA + + R A Sbjct: 197 HAVVVAVGASPGLGFVHTGHDRSFVYDIADLYKADVTIPVAFDIAAAESTDIGADTRRAV 256 Query: 252 RDIFRSSKTLAKLIPLIEDVL----AAGEIQPPAPPEDAQPVAIPL 293 RD + L + + I +L AAG I E+A A+ L Sbjct: 257 RDRVHNGALLGRCVQDIRRLLLTDSAAGPINEEEFDEEADNDAVRL 302 >UniRef50_C7LYW4 CRISPR-associated protein Cas1 n=1 Tax=Acidimicrobium ferrooxidans DSM 10331 RepID=C7LYW4_ACIFD Length = 314 Score = 194 bits (493), Expect = 3e-48, Method: Compositional matrix adjust. 
Identities = 99/265 (37%), Positives = 159/265 (60%), Gaps = 3/265 (1%) Query: 10 PLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAA 69 P+ R S ++L++ + A V + ++G T++P +V ++L PGTR++H A+ L Sbjct: 20 PVSRRSSFVYLEHCVVHRDANAVVSVTESGT-TYLPAAAVGTLLLGPGTRITHQAMLLLG 78 Query: 70 QVGTLLVWVGEAGVRVYASGQPGGARSDKLLY-QAKLALDEDLRLKVVRKMFELRFGEPA 128 + G ++ WVGE R+YA P +S + L QA+L + RL+V R+M+++RF Sbjct: 79 ESGVVVCWVGEGDTRLYAWA-PSLFQSTRFLEAQARLVSNRQDRLRVARQMYQMRFPGED 137 Query: 129 PARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLY 188 ++ ++++LRG+EG+R+R TY LA +G+ W+GR YDP + GD +N+ +S A S LY Sbjct: 138 VSKATMQRLRGMEGARIRRTYRHLASAFGIDWHGRHYDPNNSSAGDDVNRALSIANSVLY 197 Query: 189 GVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVR 248 GV AI+A G +P +GFVHTG LSFVYD+AD+ K + +P AFE A + G +VR Sbjct: 198 GVVHTAIVALGCSPGLGFVHTGHSLSFVYDVADLYKVELAIPVAFEAAAQRTGSLSSQVR 257 Query: 249 LACRDIFRSSKTLAKLIPLIEDVLA 273 R+ + L + + I +L Sbjct: 258 RTMRERIHEAHLLERAVDDIRLLLG 282 >UniRef50_C7MTM5 CRISPR-associated protein, Cas1 family n=6 Tax=Actinomycetales RepID=C7MTM5_SACVD Length = 342 Score = 191 bits (485), Expect = 3e-47, Method: Compositional matrix adjust. 
Identities = 110/274 (40%), Positives = 163/274 (59%), Gaps = 5/274 (1%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 L DRVS ++++ +D + A +I++ +P VA ++L PGTRV+H A++L A Sbjct: 18 LTDRVSSVYIERSHLDRAENAIAIINRRET-VRLPAALVAVVLLGPGTRVTHGAMQLLAD 76 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLY-QAKLALDEDLRLKVVRKMFELRFGEPAP 129 GT + WVGE GVR+YA+G G +R LL QA L RL+V R M+ +RF Sbjct: 77 SGTAVCWVGEQGVRMYAAGL-GPSRGAALLQRQAYLVSRTTTRLEVARAMYAMRFPGEDV 135 Query: 130 ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKD-WEKGDTINQCISAATSCLY 188 + +++QLRG EG+RVR Y A+Q+GV WNGR Y D + GD +N+ +SAA + LY Sbjct: 136 STLTMQQLRGREGARVRKVYRQQARQHGVPWNGRAYKAGDAFAVGDDLNRLLSAANAALY 195 Query: 189 GVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVR 248 G+ A I+ G +P +GF+HTG SFV DIAD+ K + +P AF++A R E +R+ R Sbjct: 196 GICHAVIVGLGASPGLGFIHTGSATSFVMDIADLYKAEYTIPLAFQLAARGLLE-ERDAR 254 Query: 249 LACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAP 282 A RD + L ++I ++ +LA + P P Sbjct: 255 TALRDRIAGTGLLPRIIKDVKTLLAPEGVDLPDP 288 >UniRef50_D1A5U2 CRISPR-associated protein Cas1 n=3 Tax=Actinomycetales RepID=D1A5U2_THECD Length = 315 Score = 187 bits (475), Expect = 4e-46, Method: Compositional matrix adjust. 
Identities = 105/297 (35%), Positives = 155/297 (52%), Gaps = 7/297 (2%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + DR+S I+L+ + D A D GI THIP ++ C++L PGTRV+H A Sbjct: 12 PRELTRMSDRISFIYLERCTLHREDNAITAEDADGI-THIPSATIGCLLLGPGTRVTHQA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + + G +VWVGE GVR Y+ G+ S + QA + RL+V R M+ +RF Sbjct: 71 MSVLGDSGAGVVWVGEQGVRFYSGGRSLTRSSALVEAQAIKWANRRTRLEVARAMYRMRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + PA + ++L G EG RV+ Y A +YG+TW GR Y P D+ D +NQ ++AA Sbjct: 131 PDEDPAGLTRQELLGREGRRVKERYRQEAAKYGITWKGRHYIPGDFGSSDPVNQAVTAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C+YG+ + + A G +P +GF+H+G L+FV DIAD+ K + +P AF +P + Sbjct: 191 QCMYGIAQTTVAALGCSPGLGFIHSGHELAFVLDIADLYKTEFALPIAFRTVAESPEDVG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDAG 301 R A RD L + + I+ +L P P+D I GD G Sbjct: 251 SRTRRAIRDEVNRVGLLRRCVDDIKSLLL------PDVPDDPLNSDIDQVTLQGDHG 301 >UniRef50_B1VIX8 CRISPR-associated protein n=6 Tax=Corynebacterium RepID=B1VIX8_CORU7 Length = 312 Score = 185 bits (470), Expect = 2e-45, Method: Compositional matrix adjust. 
Identities = 93/269 (34%), Positives = 154/269 (57%), Gaps = 1/269 (0%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 + DR+S ++++ + A + D+ G+ H+P +A ++L GTR+++AA+ L Sbjct: 17 MGDRISFLYVERAVVSRDGNALTITDQRGV-AHVPATQLAVLLLGTGTRITNAAMALLGD 75 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 G VWVGE GVR YA G+P S QA++ ++ RL+ R+M+ LRF + Sbjct: 76 CGVSTVWVGERGVRYYAHGRPPAKSSRLAELQARVVTNQRKRLECARRMYGLRFPGEDVS 135 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 + ++ QLRG EG+R++ YA AK+ GV WN RRYDP D++ D INQ ++ ++ LYG+ Sbjct: 136 KLTMAQLRGREGARMKRLYAAEAKRTGVAWNRRRYDPNDYDSSDPINQALTTGSAALYGI 195 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 A I+ G+ PA+G +HTG SFVYD+AD+ K + +P AF + VR Sbjct: 196 AHAVIVGLGFVPALGVIHTGTDRSFVYDVADLYKAEVSIPAAFNAVASGTEDVGPMVRRL 255 Query: 251 CRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 RD + + +++ ++ V++ + +P Sbjct: 256 VRDAVVEQRLMPRMVRDLKFVMSVPDDEP 284 >UniRef50_C7QEM2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria RepID=C7QEM2_CATAD Length = 323 Score = 183 bits (464), Expect = 8e-45, Method: Compositional matrix adjust. 
Identities = 94/269 (34%), Positives = 150/269 (55%), Gaps = 1/269 (0%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 + DRVS ++L+ + A D GI THIP ++ ++L PGTR++H A+ + Sbjct: 18 IADRVSFVYLERCTVHRDANAITAQDADGI-THIPSATIGTLLLGPGTRITHQAMAVLGD 76 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 G + WVGE G R YA+ + S + QA L + RL + R M+ +RF + P+ Sbjct: 77 CGANVAWVGEHGARFYAAARSLNRSSALVEAQATLWANRRTRLDIARAMYRMRFPDEDPS 136 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 +QL G+EG R++ Y +++ GV W+GR+Y P ++ GD INQ I+AA C+YGV Sbjct: 137 GFMRQQLLGMEGRRLKDCYRQQSQRTGVPWHGRQYTPGNFNAGDAINQAITAAAQCMYGV 196 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 I A G +P +GF+H+G LSFV DIAD+ K + +P AF+ A + + R A Sbjct: 197 AHTIITALGCSPGLGFIHSGHELSFVMDIADLYKTEIGIPVAFDTAAEDSTDIGPRTRRA 256 Query: 251 CRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 R+ R+++ L + + ++ +L +P Sbjct: 257 LREQIRTTRLLERCVDDVKALLTTPNNEP 285 >UniRef50_B4V4P5 Crispr-associated protein cas1 n=4 Tax=Streptomyces RepID=B4V4P5_9ACTO Length = 315 Score = 178 bits (451), Expect = 2e-43, Method: Compositional matrix adjust. 
Identities = 99/278 (35%), Positives = 149/278 (53%), Gaps = 6/278 (2%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + +R+S ++L+ + A D G THIP ++ ++L PGTR++H A Sbjct: 12 PRELTRVAERISFVYLERCVVHRDANAITAEDADGT-THIPSATIGTLLLGPGTRITHQA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + + A+ G + WVGE GVR YA G+ S + QA L + RL+V R M+ LRF Sbjct: 71 MSVLAESGAAVAWVGEQGVRYYAGGRALSRSSALVEAQATLWANRRTRLEVARAMYRLRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + P+ + +L G EG RV+ Y A + GV W GR Y P D+ GD NQ ++AA Sbjct: 131 PDEDPSGLTRRELLGHEGYRVKECYRHQADRTGVPWRGRHYVPGDFTAGDAPNQAVTAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C+YG+ A + A G A +GFVH+G LSFV D+AD+ K + +P AF++A + + Sbjct: 191 QCMYGIAHAVVAALGCATGLGFVHSGHELSFVLDVADLYKTEIGIPVAFDVAAESTEDIG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAP 282 R A RD ++ L + + I+ +L QP P Sbjct: 251 SRTRRALRDAVNKNRLLDRCVNDIKLLL-----QPEGP 283 >UniRef50_C7MTL6 CRISPR-associated protein, Cas1 family n=4 Tax=Actinomycetales RepID=C7MTL6_SACVD Length = 328 Score = 177 bits (448), Expect = 5e-43, Method: Compositional matrix adjust. 
Identities = 90/257 (35%), Positives = 146/257 (56%), Gaps = 8/257 (3%) Query: 41 RTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLL 100 R ++PV +++CI+ GT V+ A+ A+ T ++W G GVR+Y+ ++ L Sbjct: 57 RVYLPVAAISCILFGTGTSVTQPAMATCARHNTTVLWTGSGGVRMYSGSLAPNLTTEWLE 116 Query: 101 YQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTW 160 Q + D+ RL V +M+ +RFG PA S+ LRG+EG R++A Y LA ++G+ Sbjct: 117 RQVRAWADDSTRLAVAARMYSMRFGAEVPAGTSLNTLRGLEGQRMKALYRSLADRHGLRG 176 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIA 220 R YDP +W + + +NQ +SAA + LYG +A+LA G +PA+GF+H+GK SFVYD+A Sbjct: 177 FKRNYDPANWGEQNPVNQALSAANTALYGAVHSALLALGCSPALGFIHSGKQHSFVYDVA 236 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 D+ K +P AF + + +PDREVR+ R F + + +++ ++ +L P Sbjct: 237 DLYKAKHTIPLAFALHK--SAQPDREVRIRMRQDFHLYRLMPRIVRDVQRLL------DP 288 Query: 281 APPEDAQPVAIPLPVSL 297 + +D P V L Sbjct: 289 SIAQDHDETGEPEEVEL 305 >UniRef50_C4X9I5 CRISPR-associated Cas1 family protein n=12 Tax=Bacteria RepID=C4X9I5_KLEPN Length = 294 Score = 174 bits (442), Expect = 2e-42, Method: Compositional matrix adjust. 
Identities = 91/258 (35%), Positives = 142/258 (55%), Gaps = 3/258 (1%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 +KD+ ++L+ G++++ D + +D G +PV ++ ++L PGT V+H A++ A Sbjct: 22 VKDKYPFLYLERGRLEIDDSSVKWVDADGNVVPLPVATINTLLLGPGTTVTHEAIKTATA 81 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 + WVGE + YA+G A + L Q LA D LKV R MF RF + Sbjct: 82 ANCAVCWVGEDSLLFYAAGFLPTADTRNLKAQMALACDASSTLKVARAMFAKRFPDADLE 141 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 +S+ + G+EGSRVRA Y A++YGV W GR++ P +E D NQ +++ + LYG+ Sbjct: 142 GKSLNSMMGMEGSRVRALYQQKAQEYGVGWKGRQFTPGKFELSDLTNQVLTSTNAALYGI 201 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 + + A GY+P IGF+H+G PL FVYD+AD+ K + AF ++R G D+ Sbjct: 202 LCSVVHAMGYSPHIGFIHSGSPLPFVYDLADLYKERLCIDLAFSLSREMAGRYDKH---K 258 Query: 251 CRDIFRSSKTLAKLIPLI 268 + FR L+ LI Sbjct: 259 VSEAFRKRVIALDLLNLI 276 >UniRef50_C2BS02 CRISPR-associated protein n=1 Tax=Mobiluncus curtisii ATCC 43063 RepID=C2BS02_9ACTO Length = 314 Score = 165 bits (418), Expect = 1e-39, Method: Compositional matrix adjust. 
Identities = 87/262 (33%), Positives = 148/262 (56%), Gaps = 1/262 (0%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 ++DR+S ++++ ++ A + D GI HIP V ++L PGT+V++AA+ L Sbjct: 18 MEDRLSFLYVERAILNREGNALTIQDSRGI-AHIPATQVGVVLLGPGTKVTYAAMALLGD 76 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 G VWVGE GVR YA G+P S A+L ++ RL+ R+M+ +RF + Sbjct: 77 AGCSAVWVGEKGVRYYAHGRPAAKTSRMAEAHARLWANQRSRLRCARRMYSMRFPGEDVS 136 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 + QLRG EG+R++ YA +++ GV W R YDP D+ GD IN ++ + LYG+ Sbjct: 137 NLPLSQLRGREGARMKRIYAEQSRRTGVPWTRRSYDPNDFGAGDPINCALTEGAAALYGI 196 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 A ++ G+ P++G +H+G +FVYD+AD+ K + +P AFE + + VR Sbjct: 197 AHAVVVGLGFIPSLGIIHSGTDRAFVYDVADLYKAEISIPAAFEAVAASAEGDELNVRKR 256 Query: 251 CRDIFRSSKTLAKLIPLIEDVL 272 RD +++ + +++ ++ V+ Sbjct: 257 IRDKVVTTRLMQRMVRDLQYVM 278 >UniRef50_C9M9R9 CRISPR-associated protein Cas1 n=1 Tax=Jonquetella anthropi E3_33 E1 RepID=C9M9R9_9BACT Length = 281 Score = 164 bits (415), Expect = 4e-39, Method: Compositional matrix adjust. 
Identities = 99/270 (36%), Positives = 151/270 (55%), Gaps = 12/270 (4%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTH-----IPVGSVACIMLEPGTRVSHAAVRLAAQV 71 M++L+ G + V DG + G IP +V+ I+LEPGT ++H RL Q Sbjct: 1 MLWLERGNLFVKDGTLRFVSAGGGSLEKGTYDIPYQNVSMIVLEPGTTITHDVFRLMGQQ 60 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPAR 131 GT L+ VG+ GVR Y + G RS Q +L + RL+V M+ +RFGE P R Sbjct: 61 GTGLIAVGDKGVRCYTAPPLGPDRSALARRQVELWANPQTRLQVALAMYAIRFGEELPTR 120 Query: 132 RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVT 191 + +E LRGIEG+R+R +Y++LAK YG+TW RR++ K K D IN ++ A S +YG Sbjct: 121 K-IEDLRGIEGARLRKSYSILAKFYGLTWTLRRFNRKQPNKTDDINAAVNHAASAMYGAA 179 Query: 192 EAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP----DREV 247 + A+ A P +GFVH +F DIAD+ + + +P AF EP +R V Sbjct: 180 DIAVAAVSAIPQLGFVHAKSCRAFALDIADLYRTEITLPAAFRGLASYLEEPGMDLERHV 239 Query: 248 R-LACRDIFRSSKTLAKLIPLIEDVLAAGE 276 R L ++++R K ++K+I I++++ G+ Sbjct: 240 RKLIGQELYR-QKVISKMIDQIKELILHGQ 268 >UniRef50_C1YVP5 CRISPR-associated protein Cas1 n=1 Tax=Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111 RepID=C1YVP5_NOCDA Length = 327 Score = 162 bits (411), Expect = 9e-39, Method: Compositional matrix adjust. 
Identities = 101/288 (35%), Positives = 148/288 (51%), Gaps = 4/288 (1%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + +R+S ++L+ + A D G R ++P ++ ++L PGT V+H+A Sbjct: 12 PRELTRVGERLSFLYLERCVVHRDSNAITAEDGDGTR-YLPSATIGTLLLGPGTNVTHSA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLY-QAKLALDEDLRLKVVRKMFELR 123 + L + G +VWVGE GVR YA+G+ RS +L+ QA + RL V R M+ +R Sbjct: 71 MSLLGECGATVVWVGEHGVRYYAAGRAL-TRSSRLVEAQATAWANRRSRLDVARAMYRMR 129 Query: 124 FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAA 183 F + S + L G EG RV+A Y A + GVTW GRRY P D + D N+ I+AA Sbjct: 130 FPDLDVEALSRQALLGKEGDRVKACYREQAARTGVTWRGRRYVPGDHDVSDPPNKAITAA 189 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 C YGV A A G +P +GFVH+G FV D+AD+ K + +P AF+ A + + Sbjct: 190 AQCFYGVAHAVTAALGCSPGLGFVHSGHERGFVMDVADLYKVEIGIPVAFDAAAQGDEDV 249 Query: 244 DREVRLACRDIFRSSKTLAKLIPLIED-VLAAGEIQPPAPPEDAQPVA 290 D R RD L + + I+ +L G + EDA A Sbjct: 250 DGVTRRLLRDRINEEGLLERCVRDIKALLLGEGSVGAQGEAEDAGESA 297 >UniRef50_Q0AA34 CRISPR-associated protein Cas1 n=11 Tax=Bacteria RepID=Q0AA34_ALHEH Length = 298 Score = 146 bits (368), Expect = 9e-34, Method: Compositional matrix adjust. 
Identities = 91/277 (32%), Positives = 143/277 (51%), Gaps = 10/277 (3%) Query: 9 IPLKDRVSMIFLQYGQIDVIDGAFVLI-----DKTGIRTHIPVGSVACIMLEPGTRVSHA 63 IP DR +++L G++ V DG D IP ++ I+L PG+ V+H Sbjct: 18 IPHVDRHGLLWLTRGRLYVEDGTLHFTAAESEDLAAGDYAIPYQGLSMILLGPGSTVTHD 77 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 +RL A+ GTLL +G G + Y + G RSD A L ++ RL V R+M+ R Sbjct: 78 VLRLLARHGTLLAAIGGGGTKYYTAPPMGQGRSDVARRHATLWANKTQRLDVARRMYAFR 137 Query: 124 FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAA 183 FG P + + LRGIEG R++ Y + A ++G+ W GRRY+ + D NQ I+ A Sbjct: 138 FGRVLP-HKDIAVLRGIEGGRIKELYRVEASRFGIPWKGRRYNRNNPSAADVPNQAINHA 196 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 + + + A+ A G P +GF+H +F DIAD+ + + VP AF+ AR+ +P Sbjct: 197 ATFVEAAADIAVAATGALPPLGFIHEESSNAFTLDIADLYRGEITVPLAFQAARKVLDDP 256 Query: 244 ----DREVRLACRDIFRSSKTLAKLIPLIEDVLAAGE 276 +R +R F+ K + K+I I+D++ A + Sbjct: 257 TLSIERTLRRDAASAFQRHKVIPKMIDRIKDLINADD 293 >UniRef50_Q47PJ6 CRISPR-associated protein, Cas1 family n=4 Tax=Actinomycetales RepID=Q47PJ6_THEFY Length = 332 Score = 143 bits (361), Expect = 6e-33, Method: Compositional matrix adjust. 
Identities = 86/241 (35%), Positives = 136/241 (56%), Gaps = 8/241 (3%) Query: 11 LKDRVSMIFLQYGQIDVID-GAFVLIDKTGIRTH---IPVGSVACIMLEPGTRVSHAAVR 66 + D +S +++ +I D G ++ R H IP S+AC++L PGT ++ A+ Sbjct: 19 VSDGLSFLYVDVCRIVQTDTGVCAEVETETGRIHRVPIPTASLACVLLGPGTSITSPAMA 78 Query: 67 LAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLL-YQAKLALDEDLRLKVVRKMFELRFG 125 + T +V G G+ Y S P R+ K + QA+ D+ R V +M+E+RFG Sbjct: 79 TFMRHNTTVVTCGAGGILNYGSF-PAPNRTTKWIDRQARAYSDDRRRRDVAVRMYEMRFG 137 Query: 126 EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATS 185 E P S+E+LR +EG+R++A Y LA + V R Y+P DW+ D +N+ +SA+ + Sbjct: 138 EEPPPGASIERLRQLEGARMKALYRSLAAKNRVKPFKRNYNPHDWDDQDPVNKALSASNA 197 Query: 186 CLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDR 245 LYGV + + G PA+GF+H+GK +FVYDIAD+ K T +P AF ++R P++ Sbjct: 198 ALYGVVHSVLAHLGCHPALGFIHSGKQDAFVYDIADLYKARTTIPLAFSLSRT--ANPEQ 255 Query: 246 E 246 E Sbjct: 256 E 256 >UniRef50_A3LCN8 Putative uncharacterized protein n=1 Tax=Pseudomonas aeruginosa 2192 RepID=A3LCN8_PSEAE Length = 112 Score = 138 bits (348), Expect = 2e-31, Method: Compositional matrix adjust. Identities = 64/78 (82%), Positives = 73/78 (93%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 PL P+P+KDR+SM+F+QYGQIDV DGAFV+ID+TG+R HIPVGSVACIMLEPGTRVSHAA Sbjct: 4 PLKPLPMKDRLSMVFVQYGQIDVRDGAFVVIDQTGVRMHIPVGSVACIMLEPGTRVSHAA 63 Query: 65 VRLAAQVGTLLVWVGEAG 82 V LA+ VGTLLVWVGEAG Sbjct: 64 VHLASTVGTLLVWVGEAG 81 >UniRef50_B6IWM1 CRISPR-associated protein Cas1, putative n=1 Tax=Rhodospirillum centenum SW RepID=B6IWM1_RHOCS Length = 281 Score = 138 bits (347), Expect = 3e-31, Method: Compositional matrix adjust. 
Identities = 85/270 (31%), Positives = 138/270 (51%), Gaps = 6/270 (2%) Query: 9 IPLKDRVSMIFLQYGQIDVIDGAFVL-IDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRL 67 IP K R +I+++ ++ + +G+ V+ D G +P + ++L PG+ ++H AVR Sbjct: 13 IPQKSRNGLIYVERCRLSIDNGSLVIAFDDRGEELELPYQRLNAVLLGPGSSITHDAVRH 72 Query: 68 AAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP 127 + GT L +VG G R+Y + S QA E R+ V ++M+ RFGE Sbjct: 73 CSGHGTCLAFVGSDGTRLYTAPPLFDRDSTLARQQATWWAGESTRIMVAKRMYAKRFGE- 131 Query: 128 APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCL 187 P S++ LRG+E +R+R +Y L+A Q G+ W GRR+D D + D NQ I+ + + Sbjct: 132 TPRATSLDSLRGMEAARIRHSYELIAAQAGIVWRGRRFDRSDPDGDDLPNQAINHVVTAV 191 Query: 188 YGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARR-NPGEPDRE 246 A+ A G P +GF+H S+ DI D+ + VP AF +R + G D Sbjct: 192 EACVAIAVQATGTLPPLGFLHEDSAKSWTLDICDLYRTSVTVPLAFRCVKRIDQGATDSL 251 Query: 247 VRLACRDI---FRSSKTLAKLIPLIEDVLA 273 R+ R + R + + +I I++VLA Sbjct: 252 DRICRRAVSAHVRDTGFIDTIIDDIKEVLA 281 >UniRef50_C2KP50 CRISPR-associated Cas1 family protein n=5 Tax=Actinomycetales RepID=C2KP50_9ACTO Length = 312 Score = 109 bits (273), Expect = 1e-22, Method: Compositional matrix adjust. 
Identities = 74/274 (27%), Positives = 133/274 (48%), Gaps = 17/274 (6%) Query: 9 IPLKDRVSMIFLQYGQI--------DVIDGAFVLIDKTGIR--THIPVGSVACIMLEPGT 58 I L+DRVS ++L+Y Q+ + +G D+ ++ IPV +A + L PGT Sbjct: 19 IRLEDRVSYLYLEYCQVIQNHTGVAAISEGNHDSEDREPLKRIIQIPVAGLAVLFLGPGT 78 Query: 59 RVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRK 118 ++ A+ ++ G +++ G G Y+ + S + QA L DE K + Sbjct: 79 SITQPAMASCSRAGLTVIFSGGGGCPYYSHAMALTSSSRWAIAQAHLVADERNARKAAKF 138 Query: 119 MFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQ 178 +++ + G ++ Q+RG+EGS ++ Y L++++ V NG R +D D +NQ Sbjct: 139 LYKRQLGIDIERELTISQMRGLEGSLIKKRYRELSREFKV--NGFR---RDTGGEDVLNQ 193 Query: 179 CISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARR 238 ++ LYG +A A G PA+G +H G S ++D+AD+ K + +P +F + Sbjct: 194 ALNLVNGILYGCAASACAALGVNPALGIIHRGDIRSLLFDVADLYKPNAALPISFRSVSK 253 Query: 239 NPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 + EP + R R L +I ++ +VL Sbjct: 254 D--EPLKFARKEMRRFIYEQNVLENMISILMNVL 285 >UniRef50_Q3J7J6 CRISPR-associated protein, Cas1 family n=2 Tax=Nitrosococcus oceani RepID=Q3J7J6_NITOC Length = 295 Score = 75.1 bits (183), Expect = 3e-12, Method: Compositional matrix adjust. 
Identities = 78/295 (26%), Positives = 123/295 (41%), Gaps = 35/295 (11%) Query: 8 PIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTH---IPVGSVACIMLEPGTRVSHAA 64 PI R + +L++ ++ D V + G T IP + I+L GT ++ AA Sbjct: 2 PILPSHRQGLYYLEHCRVMAKDERVVYACQEGAFTKFFAIPPANTNVILLGSGTSLTQAA 61 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR- 123 RL A ++ +VG G ++ + Q ++ +L D D RLKV + R Sbjct: 62 ARLLASEQVMVAFVGGGGSPLFLASQNEYRPTEYCQAWMRLWQDNDQRLKVAKTFQRNRA 121 Query: 124 ---------FGEPAPARRSVE----------QLRGIEGSRVRATYALLAKQYGVTW---- 160 EP P + S+E +L G G+ + A A AK+ W Sbjct: 122 EFLMQQWPKLAEPKPHKASLEKLAERYLADIELAGDNGT-ILAQEAKFAKKLYKFWANCT 180 Query: 161 ---NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH-TGKPLSFV 216 N R DP + D N + +YG+ A + G ++ +H T + + V Sbjct: 181 ETENFTR-DPGKRDFNDPFNSYLDHGNYLVYGIAAAVLWVLGIPHSLPVIHGTTRRGALV 239 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDV 271 +D+ADIIK V+P AF+ A G D+E+R AC S + L I+ V Sbjct: 240 FDVADIIKDTCVMPIAFQHAA--AGRSDQEMRQACIAWLDESHAMTFLFQSIKRV 292 >UniRef50_C6CA70 CRISPR-associated protein Cas1 n=56 Tax=Bacteria RepID=C6CA70_DICDC Length = 333 Score = 58.9 bits (141), Expect = 2e-07, Method: Compositional matrix adjust. 
Identities = 76/311 (24%), Positives = 132/311 (42%), Gaps = 48/311 (15%) Query: 6 LNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRT---HIPVGSVACIMLEPGTRVSH 62 L I R ++ +LQ+ +I V G + + G ++ +IP+ + + +ML GT V+ Sbjct: 10 LKTILHSKRANIYYLQHCRILVNGGRVEYVTEEGNQSLYWNIPIANTSVVMLGTGTSVTQ 69 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARS-----------DKLLYQAKLALDEDL 111 AA+R A+ G ++ + G G ++A+ + A S + L +E Sbjct: 70 AAMREFARAGVMVGFCGSGGTPLFAANEAEVAVSWLSPQSEYRPTEYLQDWVSFWFNEQQ 129 Query: 112 RLKVVRKMFELRFGEPAP-------ARRS--------VEQL-----RGIEGSR----VRA 147 RL ++R G+ AR S VE L +G+ R V Sbjct: 130 RLAAAIAFQQVRIGQIRQHWLGGRLARESRFTIKPEHVEALLNRYQQGLVDCRTSNDVLV 189 Query: 148 TYALLAKQ-YGVTWNGRRY-DPKDWEKG---DTINQCISAATSCLYGVTEAAILAAGYAP 202 A++ K Y + N Y D ++G D N+ + YG+ A+ G Sbjct: 190 QEAMMTKALYRLAANAVSYGDFTRAKRGGGTDLANRFLDHGNYLAYGLAAVALWVLGLPH 249 Query: 203 AIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKT 260 + +H GK V+D+AD+IK ++P+AF A GE +++ R C FR ++ Sbjct: 250 GLAVLH-GKTRRGGLVFDVADLIKDALILPQAFIAAM--EGEDEQDFRQRCLTAFRQAEA 306 Query: 261 LAKLIPLIEDV 271 L +I ++ V Sbjct: 307 LDVMIDSLQQV 317 >UniRef50_D1BQ37 CRISPR-associated protein Cas1 n=1 Tax=Veillonella parvula DSM 2008 RepID=D1BQ37_VEIPT Length = 331 Score = 56.6 bits (135), Expect = 1e-06, Method: Compositional matrix adjust. 
Identities = 44/147 (29%), Positives = 68/147 (46%), Gaps = 9/147 (6%) Query: 132 RSVEQLRGIEGSRVRATYALLAKQYGVTW--NGRRYDPKDWEKGDTINQCISAATSCLYG 189 + VE LRGIEG R +++L W +GR+ P D +N +S S L Sbjct: 149 KKVETLRGIEGLASRTYFSVLGHVLSEPWEFSGRKRHP----SPDPVNAILSYGYSFLER 204 Query: 190 VTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREV 247 A +L AG IG +H+ + S VYD+ DI + D + ++ R+ P+ + Sbjct: 205 EVRACLLTAGLDVRIGVLHSTNNRKDSLVYDVMDIFRQDIIDRFVLKLLNRHMILPE-DF 263 Query: 248 RLACRDIFRSSKTLAKLIPLIEDVLAA 274 L+ R F S + K + L ED + A Sbjct: 264 DLSERGCFLSKEANKKWVELYEDYMKA 290 >UniRef50_C8Q0H7 CRISPR-associated protein Cas1 n=6 Tax=Proteobacteria RepID=C8Q0H7_9GAMM Length = 334 Score = 47.8 bits (112), Expect = 5e-04, Method: Compositional matrix adjust. Identities = 60/304 (19%), Positives = 119/304 (39%), Gaps = 48/304 (15%) Query: 8 PIPLKDRVSMIFLQYGQIDVIDGAFVLIDKT----GIRTHIPVGSVACIMLEPGTRVSHA 63 P+ L R + +L+ ++ + D V + ++ +IP + A ++L G+ ++ A Sbjct: 28 PLMLSKRACVFYLERVRVILKDDRIVYLTESMQPIEHFYNIPEKNTAFLLLGKGSSITDA 87 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYAS--------------------GQPGGARSDKLLYQA 103 A R A+ ++ + G G ++++ L A Sbjct: 88 AARRLAESNVMVGFCGSGGSPLFSALDLTFLAPQSEYRPTEYMQIWMKAWLDDTTRLLMA 147 Query: 104 KLALDEDLRLKVVRKMFE-----------------LRFGEPAPARRSVEQLRGIEGSRVR 146 K+ L E R+++V+K ++ + F + + + EQL EG + Sbjct: 148 KVLLQE--RIEIVKKYWQKNPLLTSYGIRLDESAVVNFSQAIESAMNQEQLLTAEGRWAK 205 Query: 147 ATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGF 206 Y LA+ G + + + D N + YG A+ G + A+ Sbjct: 206 VLYKSLAEGCGFKFTREEGKNANDDIADIANSYLDHGNYIAYGYAAVALHGLGISFALPM 265 Query: 207 VHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKL 264 +H GK V+D+AD++K V+P+AF A+ G +E R+ +I + L + Sbjct: 266 LH-GKTRRGGLVFDVADLVKDAMVMPQAFISAK--LGHNQKEFRMQLIEICQDQDVLDYM 322 Query: 265 IPLI 268 + Sbjct: 323 FGFV 326 >UniRef50_D2QT50 CRISPR-associated protein Cas1 n=1 Tax=Spirosoma linguale DSM 74 RepID=D2QT50_9SPHI Length = 351 Score = 46.6 bits (109), Expect = 0.001, Method: Compositional matrix adjust. 
Identities = 39/130 (30%), Positives = 57/130 (43%), Gaps = 18/130 (13%) Query: 125 GEPAPARRSV----EQLRGIEGSRVRATYALLA----KQYGVTWNGRRYDPKDWEKGDTI 176 GE PA V + LRG+EG+ R + L+ K+Y ++GR P D Sbjct: 157 GEQTPATTCVADVADTLRGLEGTAGRLYFETLSYVLPKEY--QFSGRSSRPAQ----DAF 210 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK--FDTVVPKA 232 N ++ LYG E ++ AG P +GF+H LS VYD + + D VV + Sbjct: 211 NAFLNYGYGMLYGKVEKTLMMAGLDPYVGFLHRDDYNQLSMVYDFIEPYRGWTDEVVFRL 270 Query: 233 FEIARRNPGE 242 F + N Sbjct: 271 FTAKKVNKAH 280 >UniRef50_C8W2P4 CRISPR-associated protein Cas1 n=1 Tax=Desulfotomaculum acetoxidans DSM 771 RepID=C8W2P4_DESAS Length = 545 Score = 45.4 bits (106), Expect = 0.002, Method: Compositional matrix adjust. Identities = 60/262 (22%), Positives = 106/262 (40%), Gaps = 60/262 (22%) Query: 20 LQYGQIDVIDGAFVLIDKTGIRTH----------IPVGSVACIMLEPGTRVSHAAVRLAA 69 L G++ +D + K G R IP+ ++ ++L +S ++L Sbjct: 210 LNLGRVLYVDEPGAYVRKKGERVQVTRDKEVLVDIPLCNLEQLVLAGTVNISAQVIKLLL 269 Query: 70 QVGTLLVWVGEAGVRVYASGQPGGARSDKL-LYQAKLALDEDLRLKVV----------RK 118 GT + +V AG + Y S QP ++ L + Q K D +LRLK + Sbjct: 270 DRGTEVHFVSRAG-KYYGSLQPALTKNSALRIAQHKAYQDMELRLKYAVLFVQGKLANMR 328 Query: 119 MFELRFGEPAPARR-------------------SVEQLRGIEGSRVRATYALLAKQYGVT 159 LR+ ++ S+ L GIEG+ R + + Sbjct: 329 TILLRYNRDLKEKQLEEAICRLKSLSKNLYKADSLNSLMGIEGAATREYFRVF------N 382 Query: 160 WNGRRYDPKDWEK------GDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH---TG 210 + +++ P ++++ GD +N +S A + L A++ GY P IGF+H G Sbjct: 383 YMIKQHVPFNFQQRSRRPPGDPVNALLSFAYTLLTKDMIASVSIVGYDPYIGFLHRSDYG 442 Query: 211 KP---LSFVYDIADIIKFDTVV 229 +P L F+ + I+ D+VV Sbjct: 443 RPALALDFIEEFRPIVA-DSVV 463 >UniRef50_C1XN81 CRISPR-associated protein Cas1 n=2 Tax=Meiothermus RepID=C1XN81_MEIRU Length = 323 Score = 45.4 bits (106), Expect = 0.003, Method: Compositional matrix adjust. 
Identities = 39/116 (33%), Positives = 56/116 (48%), Gaps = 14/116 (12%) Query: 126 EPAPARRSVEQLRGIEGSRVRATYALLA---KQYGVTWNGRRYDPKDWEKGDTINQCISA 182 E P RS+E LRGIEG+ RA +A L YG ++GR P D +N +S Sbjct: 136 EALPQARSLEALRGIEGNAARAYFAGLQAVLAPYG--FSGRNRRPPT----DAVNAALSY 189 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHT-GKPL-SFVYDIADIIK---FDTVVPKAF 233 L G A+ AG P +G +HT G+ + + +D+ + + D VV AF Sbjct: 190 GYMVLLGRVLLALGIAGLHPELGLLHTEGRRVPALAFDLMEEFRVSVVDAVVIAAF 245 >UniRef50_B8CYA1 CRISPR-associated protein Cas1 n=2 Tax=cellular organisms RepID=B8CYA1_HALOH Length = 325 Score = 44.3 bits (103), Expect = 0.006, Method: Compositional matrix adjust. Identities = 40/132 (30%), Positives = 62/132 (46%), Gaps = 19/132 (14%) Query: 114 KVVRKMFELRFGEPAPARRSVEQLR----GIEGSRVRATYA----LLAKQYGVTWNGRRY 165 K ++++ ELR + +E +R G EG+ R +A LL +Y +NGR + Sbjct: 134 KKIKEICELR-NKLEKVTGYIEDVRNTIMGYEGNISRKYFASLSFLLPDRY--KFNGRSF 190 Query: 166 DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADII 223 P + D N ++ LYG E A++ AG P +G +HT SFV+D + Sbjct: 191 RPAE----DEFNCLLNYGYGVLYGKVEKALIIAGLDPYVGILHTDGYNKKSFVFDFIEPY 246 Query: 224 K--FDTVVPKAF 233 + D VV K F Sbjct: 247 RHHIDRVVMKLF 258 >UniRef50_C0QHV1 Putative CRISPR-associated protein (Uncharacterized protein, predicted to be involved in DNA repair) n=1 Tax=Desulfobacterium autotrophicum HRM2 RepID=C0QHV1_DESAH Length = 338 Score = 43.5 bits (101), Expect = 0.009, Method: Compositional matrix adjust. 
Identities = 34/119 (28%), Positives = 55/119 (46%), Gaps = 10/119 (8%) Query: 107 LDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRR 164 +D L ++ + + E+ GE R+ + G+EG+ RA + +LA+ + GR Sbjct: 132 IDTILAIEKILQTLEVTDGENVIDLRNT--IMGLEGTSARAYFKVLARAMPEKYRFKGRS 189 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIAD 221 P D N ++ LYG E A + AG P IGF+HT S V+D+ + Sbjct: 190 RRPA----KDPFNAVLNYCYGMLYGKVEKACIIAGLDPFIGFLHTDNYNKKSLVFDLIE 244 >UniRef50_Q96X75 Putative uncharacterized protein ST2634 n=1 Tax=Sulfolobus tokodaii RepID=Q96X75_SULTO Length = 317 Score = 43.5 bits (101), Expect = 0.010, Method: Compositional matrix adjust. Identities = 50/212 (23%), Positives = 88/212 (41%), Gaps = 27/212 (12%) Query: 31 AFVLIDKTGIRTHIPVGSV-ACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGV---RVY 86 FV+ K G + ++ V +++ G V+ A+RLA G ++++ G R++ Sbjct: 32 TFVISKKDGKKVNVSPAEVDQIVIMTSGVTVTSKAIRLALDHGIDIIFLDSRGNPFGRLF 91 Query: 87 ASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVR 146 S + K Y A L +E+ + R++ + + A + + GIEG+ Sbjct: 92 HSEPIKTVETRKAQYLAILKGEEE----IPREIIKSKIKNQANHIKFWFKKLGIEGN--- 144 Query: 147 ATYALLAKQYGVTWNGRRY-----------DPKDWEKGDTINQCISAATSCLYGVTEAAI 195 Y L+ + RY +D E D N + A + LY + + Sbjct: 145 -DYKLIEGKDDDEATAARYYWHALGRIIPMKGRDPESTDPFNVSFNYAYAILYSNIQRVL 203 Query: 196 LAAGYAPAIGFVH---TGKPLSFVYDIADIIK 224 G P GF+H +GKP S VYD +++ K Sbjct: 204 QLVGLDPYAGFIHKDRSGKP-SLVYDFSEMFK 234 >UniRef50_B8D4S7 CRISPR-associated protein Cas1 n=1 Tax=Desulfurococcus kamchatkensis 1221n RepID=B8D4S7_DESK1 Length = 328 Score = 43.1 bits (100), Expect = 0.012, Method: Compositional matrix adjust. 
Identities = 33/115 (28%), Positives = 56/115 (48%), Gaps = 21/115 (18%) Query: 118 KMFELRFGEPAPARRSVEQLRGIEGSRVRATY----ALLAKQYGVTWNGRRYDPKDWEKG 173 +M E F E AR E+LR +E R + L+ K+ G ++ +D + Sbjct: 146 RMLEAEFEE---AR---EKLRQLEAEAARIYWPSLSILIPKELG-------FNSRDQDSE 192 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIKF 225 D +N ++ A LYG + ++ AG P GF+HT GKP+ +D+ ++ +F Sbjct: 193 DLVNTSLNYAYGILYGESWKVLVLAGLDPYAGFMHTDRSGKPV-LAFDLIEMFRF 246 >UniRef50_A1ZHZ5 Crispr-associated protein Cas1 n=1 Tax=Microscilla marina ATCC 23134 RepID=A1ZHZ5_9SPHI Length = 344 Score = 43.1 bits (100), Expect = 0.014, Method: Compositional matrix adjust. Identities = 35/118 (29%), Positives = 53/118 (44%), Gaps = 14/118 (11%) Query: 135 EQLRGIEGSRVRATYA----LLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 + LRG+EG+ R + +LAK+Y + GR P D N ++ LY + Sbjct: 168 DTLRGLEGTAGRLYFETLSYVLAKEY--QFAGRSKRPAH----DAFNAFLNYGYGILYSM 221 Query: 191 TEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK--FDTVVPKAFEIARRNPGEPD 244 E A++ AG P +GF+H S VYD + + + VV K F + N D Sbjct: 222 VEHALVIAGIDPFVGFMHRDGYNQRSMVYDFIEPYRGHVEQVVVKLFTAKKVNQSHTD 279 >UniRef50_A3XI90 Putative uncharacterized protein n=1 Tax=Leeuwenhoekiella blandensis MED217 RepID=A3XI90_9FLAO Length = 332 Score = 41.6 bits (96), Expect = 0.032, Method: Compositional matrix adjust. Identities = 34/113 (30%), Positives = 48/113 (42%), Gaps = 10/113 (8%) Query: 135 EQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTE 192 E RG EGS R + LA ++ GR + P D N ++ A LY E Sbjct: 164 ESFRGWEGSAGRHYFEALATCIPDAYSFKGRSFRPAQ----DEFNALLNYAYGILYSRVE 219 Query: 193 AAILAAGYAPAIGFVHTG--KPLSFVYDIAD--IIKFDTVVPKAFEIARRNPG 241 A++ AG P +GF+H S VYD + I + V K+F + N Sbjct: 220 RALMLAGLDPFVGFMHRDDYNSKSLVYDFIEPYRIYAERFVFKSFSSKKMNKS 272 >UniRef50_Q2FL78 CRISPR-associated protein, Cas1 family n=1 Tax=Methanospirillum hungatei JF-1 RepID=Q2FL78_METHJ Length = 331 Score = 41.6 bits (96), Expect = 0.036, Method: Compositional matrix adjust. 
Identities = 51/216 (23%), Positives = 87/216 (40%), Gaps = 38/216 (17%) Query: 41 RTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLL 100 RT P+G + + + +S AAVRL G + + G + P G + Sbjct: 35 RTLSPLG-LDLLAIAGDHSISTAAVRLVTSHGGAIALMDGLG-NPFGHFLPLGRSALIEQ 92 Query: 101 YQAKLALDEDLRLKVVRKMF------------------------ELRFGEPAPAR----R 132 Y+A+ + E+ RL++ R + E+R E A + + Sbjct: 93 YEAQASAPEERRLEIARSICTGALENKRTLLSNLERIRGFDLSREIRLVEDAQDKALECQ 152 Query: 133 SVEQLRGIEGSRVRATYALLAKQYGVTWN--GRRYDPKDWEKGDTINQCISAATSCLYGV 190 S++ LRG+EGS A + + + W GR +P D +N +S LY Sbjct: 153 SLDSLRGVEGSGAHAYFQGFSLAFDEEWGFLGRSQNPAT----DPVNSLLSYGYGMLYIQ 208 Query: 191 TEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIK 224 A++ +GY+P G H K + VYD+ + + Sbjct: 209 ARQALVLSGYSPYYGAYHETYKKQEALVYDLVEEFR 244 Searching..................................................done Results from round 2 Score E Sequences producing significant alignments: (bits) Value Sequences used in model and found again: UniRef50_Q46896 Uncharacterized protein ygbT n=119 Tax=cellular ... 428 e-118 UniRef50_D1A5U2 CRISPR-associated protein Cas1 n=3 Tax=Actinomyc... 391 e-107 UniRef50_D1NTI3 CRISPR-associated protein Cas1 n=10 Tax=Bacteria... 384 e-105 UniRef50_Q2JWC7 CRISPR-associated protein Cas1 n=3 Tax=Chroococc... 383 e-105 UniRef50_C7QEM2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria ... 380 e-104 UniRef50_D1CAI8 CRISPR-associated protein Cas1 n=3 Tax=Bacteria ... 373 e-102 UniRef50_D1CGD6 CRISPR-associated protein Cas1 n=7 Tax=cellular ... 371 e-101 UniRef50_D2RB04 CRISPR-associated endonuclease Cas1, ECOLI subty... 370 e-101 UniRef50_B4V4P5 Crispr-associated protein cas1 n=4 Tax=Streptomy... 368 e-101 UniRef50_C1YVP5 CRISPR-associated protein Cas1 n=1 Tax=Nocardiop... 367 e-100 UniRef50_Q3ZZ81 CRISPR-associated protein Cas1 n=4 Tax=Bacteria ... 365 1e-99 UniRef50_A8M406 CRISPR-associated protein Cas1 n=5 Tax=Actinomyc... 359 6e-98 UniRef50_C7LYW4 CRISPR-associated protein Cas1 n=1 Tax=Acidimicr... 
357 3e-97 UniRef50_B1VIX8 CRISPR-associated protein n=6 Tax=Corynebacteriu... 355 1e-96 UniRef50_C7MTM5 CRISPR-associated protein, Cas1 family n=6 Tax=A... 342 9e-93 UniRef50_Q03C58 CRISPR-associated protein n=3 Tax=Lactobacillus ... 341 2e-92 UniRef50_C2BS02 CRISPR-associated protein n=1 Tax=Mobiluncus cur... 339 8e-92 UniRef50_C7MTL6 CRISPR-associated protein, Cas1 family n=4 Tax=A... 338 1e-91 UniRef50_C4X9I5 CRISPR-associated Cas1 family protein n=12 Tax=B... 313 5e-84 UniRef50_B6IWM1 CRISPR-associated protein Cas1, putative n=1 Tax... 299 7e-80 UniRef50_C9M9R9 CRISPR-associated protein Cas1 n=1 Tax=Jonquetel... 299 8e-80 UniRef50_Q47PJ6 CRISPR-associated protein, Cas1 family n=4 Tax=A... 296 9e-79 UniRef50_Q21QB1 CRISPR-associated protein Cas1 n=1 Tax=Rhodofera... 294 3e-78 UniRef50_Q0AA34 CRISPR-associated protein Cas1 n=11 Tax=Bacteria... 291 3e-77 UniRef50_C2KP50 CRISPR-associated Cas1 family protein n=5 Tax=Ac... 278 2e-73 UniRef50_Q3J7J6 CRISPR-associated protein, Cas1 family n=2 Tax=N... 257 3e-67 UniRef50_C8Q0H7 CRISPR-associated protein Cas1 n=6 Tax=Proteobac... 236 9e-61 UniRef50_C6CA70 CRISPR-associated protein Cas1 n=56 Tax=Bacteria... 231 3e-59 UniRef50_D1BQ37 CRISPR-associated protein Cas1 n=1 Tax=Veillonel... 145 1e-33 UniRef50_D2QT50 CRISPR-associated protein Cas1 n=1 Tax=Spirosoma... 116 9e-25 UniRef50_A3LCN8 Putative uncharacterized protein n=1 Tax=Pseudom... 110 6e-23 Sequences not found previously or not previously below threshold: UniRef50_B7GYY4 CRISPR-associated protein Cas1 n=8 Tax=Proteobac... 213 5e-54 UniRef50_A9BUF1 CRISPR-associated protein Cas1 n=33 Tax=Proteoba... 199 1e-49 UniRef50_C8W2P4 CRISPR-associated protein Cas1 n=1 Tax=Desulfoto... 99 3e-19 UniRef50_B9LWK7 CRISPR-associated protein Cas1 n=4 Tax=Halobacte... 95 2e-18 UniRef50_A4X3M4 CRISPR-associated protein, Cas1 family n=4 Tax=A... 94 7e-18 UniRef50_O57912 Putative uncharacterized protein PH0173 n=1 Tax=... 
92 3e-17 UniRef50_A1ZHZ5 Crispr-associated protein Cas1 n=1 Tax=Microscil... 90 1e-16 UniRef50_A9AYP8 CRISPR-associated protein Cas1 n=1 Tax=Herpetosi... 90 1e-16 UniRef50_B8CYA1 CRISPR-associated protein Cas1 n=2 Tax=cellular ... 89 2e-16 UniRef50_Q2FL78 CRISPR-associated protein, Cas1 family n=1 Tax=M... 89 2e-16 UniRef50_D2NTT1 Uncharacterized protein predicted to be involved... 88 3e-16 UniRef50_B0VIK5 Putative uncharacterized protein n=1 Tax=Candida... 87 6e-16 UniRef50_Q1CWU5 CRISPR-associated protein Cas1 n=1 Tax=Myxococcu... 86 1e-15 UniRef50_Q96X75 Putative uncharacterized protein ST2634 n=1 Tax=... 86 1e-15 UniRef50_A3XI90 Putative uncharacterized protein n=1 Tax=Leeuwen... 86 2e-15 UniRef50_C7QUZ4 CRISPR-associated protein Cas1 n=8 Tax=Cyanobact... 85 2e-15 UniRef50_C1ZJF3 CRISPR-associated protein, Cas1 family; CRISPR-a... 85 3e-15 UniRef50_C7NA04 CRISPR-associated protein Cas1 n=1 Tax=Leptotric... 84 6e-15 UniRef50_UPI0001C16754 protein of unknown function DUF48 n=1 Tax... 84 8e-15 UniRef50_C0QHV1 Putative CRISPR-associated protein (Uncharacteri... 84 8e-15 UniRef50_B7KMR5 CRISPR-associated protein Cas1 n=3 Tax=Chroococc... 82 2e-14 UniRef50_D0MKV7 CRISPR-associated protein Cas1 n=1 Tax=Rhodother... 82 2e-14 UniRef50_A2BKJ8 Universally conserved protein n=1 Tax=Hypertherm... 80 7e-14 UniRef50_A6UVX9 CRISPR-associated protein Cas1 n=2 Tax=Methanoco... 80 1e-13 UniRef50_C1XN81 CRISPR-associated protein Cas1 n=2 Tax=Meiotherm... 79 2e-13 UniRef50_A8UXX8 Putative uncharacterized protein n=1 Tax=Hydroge... 79 2e-13 UniRef50_UPI00016C522C CRISPR-associated protein Cas1/Cas4 n=1 T... 78 4e-13 UniRef50_B9K7F7 CRISPR-associated protein, Cas1 family n=1 Tax=T... 78 4e-13 UniRef50_A9GDF7 Putative uncharacterized protein n=1 Tax=Sorangi... 77 8e-13 UniRef50_C1XWQ6 CRISPR-associated protein, Cas1 family n=3 Tax=T... 77 8e-13 UniRef50_A1A2M8 CRISPR-associated DNA polymerase n=14 Tax=Bacter... 
77 9e-13 UniRef50_D0YU98 Crispr-associated protein Cas1 n=1 Tax=Mobiluncu... 77 1e-12 UniRef50_C3MX12 CRISPR-associated protein Cas1 n=11 Tax=Sulfolob... 76 1e-12 UniRef50_C4G3M4 Putative uncharacterized protein n=1 Tax=Abiotro... 76 1e-12 UniRef50_C9M4E6 CRISPR-associated protein cas1 n=1 Tax=Lactobaci... 75 2e-12 UniRef50_A6UNF5 CRISPR-associated protein Cas1 n=1 Tax=Methanoco... 75 2e-12 UniRef50_A5UXM3 CRISPR-associated protein, Cas1 family n=4 Tax=C... 75 2e-12 UniRef50_B7IHY4 Cas crispr-associated protein Cas1 n=4 Tax=Bacte... 75 3e-12 UniRef50_B7C8S2 Putative uncharacterized protein n=2 Tax=Eubacte... 75 3e-12 UniRef50_A8ABK8 CRISPR-associated protein Cas1 n=1 Tax=Ignicoccu... 75 3e-12 UniRef50_A1BI39 CRISPR-associated protein Cas1 n=5 Tax=Chlorobia... 75 3e-12 UniRef50_Q53W21 Putative uncharacterized protein TTHB145 n=3 Tax... 74 6e-12 UniRef50_C2GEC7 Crispr-associated protein Cas1 n=2 Tax=Corynebac... 74 8e-12 UniRef50_Q65S18 Putative uncharacterized protein n=1 Tax=Mannhei... 73 1e-11 UniRef50_A7GY67 Crispr-associated protein Cas1 n=6 Tax=Campyloba... 73 1e-11 UniRef50_B0TFX3 Crispr-associated protein cas1 n=4 Tax=Clostridi... 72 2e-11 UniRef50_A7HMV0 CRISPR-associated protein Cas1 n=3 Tax=Thermotog... 72 3e-11 UniRef50_C1DUM1 Crispr-associated protein Cas1 n=18 Tax=Bacteria... 72 3e-11 UniRef50_B8GDW2 CRISPR-associated protein Cas1 n=1 Tax=Methanosp... 72 3e-11 UniRef50_D2PIT7 CRISPR-associated protein Cas1 n=2 Tax=Sulfolobu... 72 3e-11 UniRef50_UPI0001C41A73 CRISPR-associated protein Cas1-2 n=1 Tax=... 71 4e-11 UniRef50_Q8F1F5 Putative uncharacterized protein n=2 Tax=Leptosp... 71 5e-11 UniRef50_A5D0Y0 Uncharacterized protein n=40 Tax=cellular organi... 71 5e-11 UniRef50_C9M4Y8 CRISPR-associated protein Cas1 n=1 Tax=Jonquetel... 71 5e-11 UniRef50_C5BP90 CRISPR-associated protein Cas1 n=3 Tax=Gammaprot... 71 5e-11 UniRef50_A4J500 CRISPR-associated protein Cas1 n=1 Tax=Desulfoto... 
71 5e-11 UniRef50_Q6L317 DNA polymerase n=2 Tax=Thermoplasmatales RepID=Q... 71 6e-11 UniRef50_A4FXZ8 CRISPR-associated protein, Cas1 family n=9 Tax=c... 71 6e-11 UniRef50_A7NP58 CRISPR-associated protein Cas1 n=6 Tax=Chlorofle... 70 6e-11 UniRef50_D2LF35 CRISPR-associated protein Cas1 n=1 Tax=Rhodomicr... 70 7e-11 UniRef50_Q2SIC8 CRISPR-associated protein Cas1 n=6 Tax=Gammaprot... 70 7e-11 UniRef50_Q74H36 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Geob... 70 8e-11 UniRef50_Q9YCL8 Putative CRISPR-associated protein Cas1 n=1 Tax=... 70 1e-10 UniRef50_A5ILM3 CRISPR-associated protein, Cas1 family n=3 Tax=B... 70 1e-10 UniRef50_C9LM09 CRISPR-associated protein Cas1 n=1 Tax=Dialister... 70 1e-10 UniRef50_D1N0J7 CRISPR-associated protein Cas1 n=1 Tax=Victivall... 70 1e-10 UniRef50_A7HNI6 CRISPR-associated protein Cas1 n=1 Tax=Fervidoba... 70 1e-10 UniRef50_B8GSH8 CRISPR-associated protein Cas1 n=1 Tax=Thioalkal... 70 1e-10 UniRef50_UPI000197AF65 hypothetical protein BACCOPRO_02409 n=1 T... 70 1e-10 UniRef50_A3MVN2 CRISPR-associated protein, Cas1 family n=4 Tax=T... 69 2e-10 UniRef50_B0K547 CRISPR-associated protein Cas1 n=12 Tax=Bacteria... 69 2e-10 UniRef50_Q8YZS6 Alr0381 protein n=6 Tax=Cyanobacteria RepID=Q8YZ... 69 2e-10 UniRef50_B2SPB2 Crispr-associated protein Cas1 n=56 Tax=Bacteria... 68 3e-10 UniRef50_A1WUP2 CRISPR-associated protein Cas1 n=1 Tax=Halorhodo... 68 3e-10 UniRef50_Q2LQX3 Uncharacterized protein predicted to be involved... 68 4e-10 UniRef50_Q6L363 DNA polymerase n=1 Tax=Picrophilus torridus RepI... 68 4e-10 UniRef50_B6IX22 CRISPR-associated protein Cas1, putative n=2 Tax... 68 4e-10 UniRef50_D0KYZ2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria ... 68 5e-10 UniRef50_Q7MRD4 Putative uncharacterized protein n=1 Tax=Wolinel... 68 5e-10 UniRef50_C0BZ41 Putative uncharacterized protein n=3 Tax=Clostri... 67 6e-10 UniRef50_C0FSR1 Putative uncharacterized protein n=1 Tax=Rosebur... 
67 6e-10 UniRef50_C9LGP6 CRISPR-associated protein Cas1 n=3 Tax=Prevotell... 67 7e-10 UniRef50_C7RP03 CRISPR-associated protein Cas1 n=1 Tax=Candidatu... 67 7e-10 UniRef50_Q467D6 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Meth... 67 8e-10 UniRef50_A3CTI4 CRISPR-associated protein Cas1 n=1 Tax=Methanocu... 67 8e-10 UniRef50_C3MWK6 CRISPR-associated protein Cas1 n=6 Tax=Sulfolobu... 67 8e-10 UniRef50_B4W4R1 CRISPR-associated protein Cas1 n=1 Tax=Microcole... 67 9e-10 UniRef50_C7G6C1 CRISPR-associated protein Cas1 n=3 Tax=Firmicute... 67 1e-09 UniRef50_B9M9X7 Putative uncharacterized protein n=1 Tax=Diaphor... 67 1e-09 UniRef50_A1HM55 CRISPR-associated protein Cas1 n=2 Tax=Thermosin... 66 1e-09 UniRef50_Q0AW57 CRISPR-associated protein, Cas1 family n=1 Tax=S... 66 1e-09 UniRef50_B9YDC3 Putative uncharacterized protein n=1 Tax=Holdema... 66 1e-09 UniRef50_C3WS02 CRISPR-associated protein n=2 Tax=Fusobacterium ... 66 2e-09 UniRef50_B7A8Y4 CRISPR-associated protein Cas1 n=1 Tax=Thermus a... 66 2e-09 UniRef50_Q2NH78 Putative uncharacterized protein n=1 Tax=Methano... 66 2e-09 UniRef50_Q2RY11 CRISPR-associated protein, Cas1 family / CRISPR-... 65 2e-09 UniRef50_A8ABE8 CRISPR-associated protein Cas1 n=1 Tax=Ignicoccu... 65 2e-09 UniRef50_C8WTR3 CRISPR-associated protein Cas1 n=2 Tax=Alicyclob... 65 3e-09 UniRef50_Q74N45 NEQ017 n=1 Tax=Nanoarchaeum equitans RepID=Q74N4... 65 3e-09 UniRef50_B3PMY9 Putative uncharacterized protein n=1 Tax=Mycopla... 65 4e-09 UniRef50_A0LWB2 CRISPR-associated protein Cas1 n=1 Tax=Acidother... 65 4e-09 UniRef50_Q0ADY5 CRISPR-associated protein, Cas1 family n=2 Tax=N... 65 4e-09 UniRef50_B8D4S7 CRISPR-associated protein Cas1 n=1 Tax=Desulfuro... 65 4e-09 UniRef50_C8PKY6 Putative CRISPR-associated protein Cas1 n=1 Tax=... 64 5e-09 UniRef50_A3ZPG0 Putative uncharacterized protein n=1 Tax=Blastop... 64 5e-09 UniRef50_D2R8Z2 CRISPR-associated protein Cas1 n=1 Tax=Pirellula... 
64 6e-09 UniRef50_B5IAF4 CRISPR-associated protein Cas1 n=3 Tax=Euryarcha... 64 6e-09 UniRef50_C9RRG3 CRISPR-associated protein Cas1 n=1 Tax=Fibrobact... 64 6e-09 UniRef50_Q2J7N9 CRISPR-associated protein Cas1 n=1 Tax=Frankia s... 63 8e-09 UniRef50_Q1J1U7 CRISPR-associated protein Cas1 n=1 Tax=Deinococc... 63 8e-09 UniRef50_D0MKP4 CRISPR-associated protein Cas1 n=1 Tax=Rhodother... 63 9e-09 UniRef50_B2RM83 CRISPR-associated protein Cas1 n=3 Tax=Bacteroid... 63 1e-08 UniRef50_B1YCK7 CRISPR-associated protein Cas1 n=2 Tax=Thermopro... 63 1e-08 UniRef50_C0QR16 Crispr-associated protein Cas1 n=23 Tax=Bacteria... 63 1e-08 UniRef50_O28401 Putative uncharacterized protein n=1 Tax=Archaeo... 62 2e-08 UniRef50_B4ATI8 Crispr-associated protein Cas1 n=5 Tax=Proteobac... 62 2e-08 UniRef50_O66692 Putative uncharacterized protein n=2 Tax=Aquific... 62 2e-08 UniRef50_C4FMU2 Putative uncharacterized protein n=1 Tax=Veillon... 62 2e-08 UniRef50_Q1Q3I6 Putative uncharacterized protein n=1 Tax=Candida... 62 3e-08 UniRef50_A2SRR7 CRISPR-associated protein, Cas1 family n=24 Tax=... 62 3e-08 UniRef50_Q2FQQ2 CRISPR-associated protein Cas1 n=2 Tax=Methanomi... 62 3e-08 UniRef50_B8G918 CRISPR-associated protein Cas1 n=5 Tax=Chlorofle... 62 3e-08 UniRef50_C5SD37 CRISPR-associated protein Cas1 n=1 Tax=Allochrom... 62 3e-08 UniRef50_Q3B3C1 CRISPR-associated protein, Cas1 family n=20 Tax=... 61 5e-08 UniRef50_C0W0W5 CRISPR-associated Cas1 family protein n=1 Tax=Ac... 61 5e-08 UniRef50_B3E1C9 CRISPR-associated protein Cas1 n=1 Tax=Methylaci... 61 6e-08 UniRef50_D2QAN8 CRISPR-associated protein Cas1 n=2 Tax=Bifidobac... 61 6e-08 UniRef50_C6HZN2 CRISPR-associated protein Cas1 n=1 Tax=Leptospir... 60 8e-08 UniRef50_B8DWG2 CRISPR-associated protein Cas1 n=4 Tax=Bifidobac... 60 9e-08 UniRef50_B5IHG3 CRISPR-associated protein Cas1 n=3 Tax=Acidulipr... 60 9e-08 UniRef50_A5UJ50 Uncharacterized protein predicted to be involved... 
60 9e-08 UniRef50_Q8YWX6 Alr1468 protein n=4 Tax=Cyanobacteria RepID=Q8YW... 60 1e-07 UniRef50_A6DE65 Putative uncharacterized protein n=1 Tax=Caminib... 60 1e-07 UniRef50_A0LHZ4 CRISPR-associated protein, Cas1 family n=2 Tax=D... 60 1e-07 UniRef50_C0WRP8 CRISPR-associated protein n=18 Tax=Lactobacillac... 60 1e-07 UniRef50_Q8TVS6 Uncharacterized conserved protein n=1 Tax=Methan... 60 1e-07 UniRef50_B0JKW9 CRISPR-associated protein Cas1 n=6 Tax=Cyanobact... 59 2e-07 UniRef50_C6I8L1 CRISPR-associated protein n=1 Tax=Bacteroides sp... 59 2e-07 UniRef50_B9LX94 CRISPR-associated protein Cas1 n=2 Tax=Halobacte... 59 2e-07 UniRef50_Q1WVJ8 CRISPR-associated protein n=1 Tax=Lactobacillus ... 59 2e-07 UniRef50_Q1CW50 CRISPR-associated fusion protein Cas4/Cas1 n=5 T... 58 3e-07 UniRef50_Q03KT5 CRISPR-associated protein, Cas1 family n=5 Tax=S... 58 4e-07 UniRef50_Q5X8T5 Putative uncharacterized protein n=1 Tax=Legione... 58 5e-07 UniRef50_A4FJX8 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Sacc... 57 7e-07 UniRef50_B3EG05 CRISPR-associated protein Cas1 n=11 Tax=Bacteria... 57 8e-07 UniRef50_B8GLF6 CRISPR-associated Cas1/Cas4 family protein n=1 T... 57 9e-07 UniRef50_Q8F874 Putative uncharacterized protein n=1 Tax=Leptosp... 57 1e-06 UniRef50_A1RZT8 CRISPR-associated protein Cas1 n=1 Tax=Thermofil... 57 1e-06 UniRef50_A7I668 CRISPR-associated protein Cas1 n=1 Tax=Candidatu... 57 1e-06 UniRef50_D0LSW9 CRISPR-associated protein Cas1 n=1 Tax=Haliangiu... 56 1e-06 UniRef50_D0W646 CRISPR-associated protein Cas1 n=1 Tax=Neisseria... 56 2e-06 UniRef50_B7KM77 CRISPR-associated protein Cas1 n=1 Tax=Cyanothec... 55 2e-06 UniRef50_B1GZM4 CRISPR-associated protein Cas1 n=1 Tax=unculture... 55 3e-06 UniRef50_UPI0000F51762 hypothetical protein Faci_00015 n=1 Tax=F... 55 3e-06 UniRef50_D0MJ58 CRISPR-associated protein Cas1 n=1 Tax=Rhodother... 55 4e-06 UniRef50_C3WD45 CRISPR-associated protein cas1 n=1 Tax=Fusobacte... 
55 4e-06 UniRef50_A8REI1 Putative uncharacterized protein n=2 Tax=unclass... 55 4e-06 UniRef50_B3W9S5 CRISPR-associated protein n=4 Tax=Lactobacillus ... 55 4e-06 UniRef50_Q2FPW6 CRISPR-associated protein Cas1 n=1 Tax=Methanosp... 55 5e-06 UniRef50_A1WH94 CRISPR-associated protein, Cas1 family n=5 Tax=P... 54 5e-06 UniRef50_A3DLB7 CRISPR-associated protein, Cas1 family n=1 Tax=S... 54 6e-06 UniRef50_A7BYC5 Protein containing DUF48 n=1 Tax=Beggiatoa sp. P... 54 6e-06 UniRef50_B5YJS2 Crispr-associated protein Cas1 n=1 Tax=Thermodes... 54 7e-06 UniRef50_C0A724 CRISPR-associated Cas1/Cas4 family protein n=1 T... 54 8e-06 UniRef50_C6MJ62 CRISPR-associated protein Cas1 n=5 Tax=Nitrosomo... 53 1e-05 UniRef50_Q7MTH7 CRISPR-associated protein Cas1 n=3 Tax=Porphyrom... 53 1e-05 UniRef50_B5ZLL1 CRISPR-associated protein Cas1 n=2 Tax=Gluconace... 53 1e-05 UniRef50_A1BI46 CRISPR-associated protein Cas1 n=2 Tax=Chlorobia... 53 1e-05 UniRef50_UPI0001BCCAFD hypothetical protein SnoxA4_00467 n=1 Tax... 53 2e-05 UniRef50_C8W3G7 CRISPR-associated protein Cas1 n=24 Tax=Bacteria... 52 2e-05 UniRef50_C5EZ73 Crispr-protein cas1 n=1 Tax=Helicobacter pulloru... 52 2e-05 UniRef50_A7HP88 CRISPR-associated protein Cas1 n=1 Tax=Parvibacu... 52 2e-05 UniRef50_C7V674 Predicted protein n=2 Tax=Enterococcus faecalis ... 52 2e-05 UniRef50_Q13CC1 CRISPR-associated protein, Cas1 family n=2 Tax=R... 52 3e-05 UniRef50_Q6A5T6 Putative uncharacterized protein n=1 Tax=Propion... 52 4e-05 UniRef50_Q03LF6 CRISPR-associated protein, Cas1 family n=6 Tax=S... 51 4e-05 UniRef50_B1X158 DUF48-containing protein n=12 Tax=Cyanobacteria ... 51 5e-05 UniRef50_B2KB47 CRISPR-associated protein Cas1 n=2 Tax=Elusimicr... 50 8e-05 UniRef50_B1L400 CRISPR-associated protein Cas1 n=2 Tax=Archaea R... 50 1e-04 UniRef50_B9M5J4 CRISPR-associated protein Cas1 n=9 Tax=Bacteria ... 50 1e-04 UniRef50_B4AQ39 Crispr-associated protein Cas1 n=5 Tax=Francisel... 
50 1e-04 UniRef50_Q5LZX6 Putative uncharacterized protein n=2 Tax=Strepto... 50 1e-04 UniRef50_A2SQK9 Uncharacterized protein predicted to be involved... 50 1e-04 UniRef50_C7XMU1 CRISPR-associated protein cas1 n=2 Tax=Fusobacte... 49 2e-04 UniRef50_D1VVR4 CRISPR-associated endonuclease Cas1, DVULG subty... 49 2e-04 UniRef50_A8TI31 CRISPR-associated protein Cas1 n=1 Tax=Methanoco... 49 2e-04 UniRef50_P71636 CRISPR-associated protein Cas1 n=11 Tax=Mycobact... 48 4e-04 UniRef50_B2UP48 CRISPR-associated protein Cas1 n=1 Tax=Akkermans... 48 5e-04 UniRef50_UPI00016B206F hypothetical protein cdiviTM7_00753 n=1 T... 48 6e-04 UniRef50_A8LN06 CRISPR-associated protein Cas1 n=2 Tax=Rhodobact... 48 6e-04 UniRef50_C8N6V7 Putative uncharacterized protein n=1 Tax=Cardiob... 47 8e-04 UniRef50_A6DE79 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Cami... 47 0.001 UniRef50_A6VLA8 CRISPR-associated protein Cas1 n=13 Tax=Proteoba... 47 0.001 UniRef50_A9AX66 CRISPR-associated protein Cas1 n=1 Tax=Herpetosi... 47 0.001 UniRef50_C9RJP2 CRISPR-associated protein Cas1 n=21 Tax=cellular... 46 0.001 UniRef50_C0WE67 Crispr-associated protein n=1 Tax=Acidaminococcu... 45 0.002 UniRef50_B0VHC1 Putative uncharacterized protein n=1 Tax=Candida... 45 0.002 UniRef50_C7G696 CRISPR-associated protein Cas1 n=2 Tax=Roseburia... 45 0.003 UniRef50_Q03JI7 CRISPR-associated protein, Cas1 family n=40 Tax=... 45 0.003 UniRef50_D0WRI5 CRISPR-associated protein Cas1, NMENI subtype n=... 45 0.004 UniRef50_B3DR65 Putative uncharacterized protein n=1 Tax=Bifidob... 45 0.005 UniRef50_B9CMG3 CRISPR-associated protein Cas1, NMENI subtype n=... 44 0.005 UniRef50_UPI000174611D CRISPR-associated protein Cas1/Cas4 n=1 T... 44 0.006 UniRef50_C2EF73 CRISPR-associated Cas1 family protein n=1 Tax=La... 44 0.007 UniRef50_D1AUW5 CRISPR-associated protein Cas1 n=1 Tax=Streptoba... 44 0.008 UniRef50_C8PNV3 CRISPR-associated protein Cas1 n=1 Tax=Treponema... 
43 0.013 UniRef50_B8I084 CRISPR-associated protein Cas1 n=1 Tax=Clostridi... 42 0.024 UniRef50_A9GQD8 Putative uncharacterized protein n=1 Tax=Sorangi... 40 0.080 >UniRef50_Q46896 Uncharacterized protein ygbT n=119 Tax=cellular organisms RepID=YGBT_ECOLI Length = 305 Score = 428 bits (1100), Expect = e-118, Method: Composition-based stats. Identities = 305/305 (100%), Positives = 305/305 (100%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV 60 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV Sbjct: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV 60 Query: 61 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF 120 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF Sbjct: 61 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF 120 Query: 121 ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180 ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI Sbjct: 121 ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180 Query: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP 240 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP Sbjct: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP 240 Query: 241 GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDA 300 GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDA Sbjct: 241 GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDA 300 Query: 301 GHRSS 305 GHRSS Sbjct: 301 GHRSS 305 >UniRef50_D1A5U2 CRISPR-associated protein Cas1 n=3 Tax=Actinomycetales RepID=D1A5U2_THECD Length = 315 Score = 391 bits (1005), Expect = e-107, Method: Composition-based stats. 
Identities = 105/297 (35%), Positives = 155/297 (52%), Gaps = 7/297 (2%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + DR+S I+L+ + D A D GI THIP ++ C++L PGTRV+H A Sbjct: 12 PRELTRMSDRISFIYLERCTLHREDNAITAEDADGI-THIPSATIGCLLLGPGTRVTHQA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + + G +VWVGE GVR Y+ G+ S + QA + RL+V R M+ +RF Sbjct: 71 MSVLGDSGAGVVWVGEQGVRFYSGGRSLTRSSALVEAQAIKWANRRTRLEVARAMYRMRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + PA + ++L G EG RV+ Y A +YG+TW GR Y P D+ D +NQ ++AA Sbjct: 131 PDEDPAGLTRQELLGREGRRVKERYRQEAAKYGITWKGRHYIPGDFGSSDPVNQAVTAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C+YG+ + + A G +P +GF+H+G L+FV DIAD+ K + +P AF +P + Sbjct: 191 QCMYGIAQTTVAALGCSPGLGFIHSGHELAFVLDIADLYKTEFALPIAFRTVAESPEDVG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDAG 301 R A RD L + + I+ +L P P+D I GD G Sbjct: 251 SRTRRAIRDEVNRVGLLRRCVDDIKSLLL------PDVPDDPLNSDIDQVTLQGDHG 301 >UniRef50_D1NTI3 CRISPR-associated protein Cas1 n=10 Tax=Bacteria RepID=D1NTI3_9BIFI Length = 366 Score = 384 bits (987), Expect = e-105, Method: Composition-based stats. 
Identities = 106/296 (35%), Positives = 162/296 (54%), Gaps = 4/296 (1%) Query: 7 NPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVR 66 + +DR++ ++ ++ ++ + A + D G+R HIP +++ +ML PGT V+H A+ Sbjct: 32 ELVRCEDRLTFLYFEHCVVNRDNNAITVTDDRGVR-HIPAAALSVLMLGPGTSVTHQAMM 90 Query: 67 LAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE 126 + G ++WVGE GVR Y SG+P S+ L QA+L + RL V R M+ +RF Sbjct: 91 VIGDNGATVIWVGERGVRTYCSGKPLTHSSNLLQKQAQLVTNMRKRLSVARAMYAMRFPH 150 Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 + +++QLRG EG+RVR Y +KQ GV W R Y P+D+ D INQ +SAA C Sbjct: 151 EDVSNLTMQQLRGREGARVRRVYRHWSKQTGVRWERRDYRPEDFADSDRINQALSAANIC 210 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDRE 246 LYG+ A I+A G +P +GFVHTG LSFVYD+AD+ K + +P AF+ A + Sbjct: 211 LYGIAHAVIVALGCSPGLGFVHTGHELSFVYDMADLYKAELSIPVAFKTAATEVDDIGGA 270 Query: 247 VRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP---EDAQPVAIPLPVSLGD 299 VR A RD + +++ I + A + + D + + S GD Sbjct: 271 VRRAMRDAMYDLSIMPRMVKDIHHLFDAADAENEGNNLYLWDGKEGTVEAGRSYGD 326 >UniRef50_Q2JWC7 CRISPR-associated protein Cas1 n=3 Tax=Chroococcales RepID=Q2JWC7_SYNJA Length = 315 Score = 383 bits (983), Expect = e-105, Method: Composition-based stats. 
Identities = 112/299 (37%), Positives = 175/299 (58%), Gaps = 6/299 (2%) Query: 6 LNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 L IP ++D +S ++++ +I+ A ++ + G R IP S+ +ML PGT ++HAA Sbjct: 10 LRSIPKVRDSISFVYVERCRIEQDAKAIAVLQEDG-RYIIPCASLTTLMLGPGTAITHAA 68 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 ++ A + WVGE G+R YASG + ++L +QAKL D ++VVR+M+ RF Sbjct: 69 IKNLADGLCSVQWVGEDGLRFYASGSHPSSSVERLYHQAKLWADPVQHMEVVRRMYSFRF 128 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 EP ++EQ+RG+EG RVR Y+ L+K+ GV W GR Y K+WE D +N+ +S A Sbjct: 129 PEPLKEGLTLEQIRGLEGVRVRTVYSRLSKETGVNWKGRSYKLKEWECADPVNRALSVAN 188 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 +CLY V +AA+ A GY+ A+GF+H GKPLSFVYD+AD+ K + +P AF+ A + Sbjct: 189 TCLYAVCQAALNAVGYSTALGFIHIGKPLSFVYDVADLYKTEITIPVAFKAAAELMPNFE 248 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP----EDAQPVAIPLPVSLGD 299 R CR+ F + + ++I ++ +L Q + P D + A+ VS G+ Sbjct: 249 SRTRQLCREKFVEHRLMQRIIDDVDAILGFRATQEESSPVGSLWDNEKGAVEGGVSYGE 307 >UniRef50_C7QEM2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria RepID=C7QEM2_CATAD Length = 323 Score = 380 bits (975), Expect = e-104, Method: Composition-based stats. 
Identities = 95/279 (34%), Positives = 152/279 (54%), Gaps = 1/279 (0%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + DRVS ++L+ + A D GI THIP ++ ++L PGTR++H A Sbjct: 12 PYQLPRIADRVSFVYLERCTVHRDANAITAQDADGI-THIPSATIGTLLLGPGTRITHQA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + + G + WVGE G R YA+ + S + QA L + RL + R M+ +RF Sbjct: 71 MAVLGDCGANVAWVGEHGARFYAAARSLNRSSALVEAQATLWANRRTRLDIARAMYRMRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + P+ +QL G+EG R++ Y +++ GV W+GR+Y P ++ GD INQ I+AA Sbjct: 131 PDEDPSGFMRQQLLGMEGRRLKDCYRQQSQRTGVPWHGRQYTPGNFNAGDAINQAITAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C+YGV I A G +P +GF+H+G LSFV DIAD+ K + +P AF+ A + + Sbjct: 191 QCMYGVAHTIITALGCSPGLGFIHSGHELSFVMDIADLYKTEIGIPVAFDTAAEDSTDIG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP 283 R A R+ R+++ L + + ++ +L +P + Sbjct: 251 PRTRRALREQIRTTRLLERCVDDVKALLTTPNNEPGSSD 289 >UniRef50_D1CAI8 CRISPR-associated protein Cas1 n=3 Tax=Bacteria RepID=D1CAI8_SPHTD Length = 314 Score = 373 bits (958), Expect = e-102, Method: Composition-based stats. 
Identities = 117/288 (40%), Positives = 175/288 (60%), Gaps = 6/288 (2%) Query: 4 LPLNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 + L+ +P ++D S +++++ +I+ A + D G+ +P S+ +ML PGT +SH Sbjct: 1 MDLHILPKVRDSWSYLYVEHARIEQEAKAIAIHDAVGM-VPVPCASLGILMLGPGTSISH 59 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 AA+R A+ G L++W GE GVR YA G + L+ QA+L D LRL+VV +M+++ Sbjct: 60 AAIRTLAENGCLVLWTGEEGVRFYAQGLGETRSARNLMRQARLWADPALRLRVVFRMYQM 119 Query: 123 RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISA 182 RF EP P +++Q+RG+EG+RVR YA +++ GV W GR + ++W D IN+ +S Sbjct: 120 RFSEPLPPDLTLQQIRGMEGARVRDAYARASRETGVPWRGRSFQRRNWSATDPINRALSC 179 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGE 242 A SCLYG+ AAI++ GY+P +GF+HTGK LSFVYDIAD+ K +P AF + + Sbjct: 180 ANSCLYGICHAAIVSLGYSPGLGFIHTGKMLSFVYDIADLYKATVTIPLAFRVVAEGTHD 239 Query: 243 PDREVRLACRDIFRSSKTLAKLIPLIEDVL----AAGEIQPPAPPEDA 286 + VR ACRD F + + L + IE VL A G P EDA Sbjct: 240 LEGRVRRACRDAFVAHRLLGTIATDIEHVLDISDADGGADEPDFDEDA 287 >UniRef50_D1CGD6 CRISPR-associated protein Cas1 n=7 Tax=cellular organisms RepID=D1CGD6_THET1 Length = 324 Score = 371 bits (952), Expect = e-101, Method: Composition-based stats. 
Identities = 125/299 (41%), Positives = 177/299 (59%), Gaps = 5/299 (1%) Query: 1 MTWLPLNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTR 59 M L+ +P + D S +++++ +ID A L D TG +T +P S++ +ML PGT Sbjct: 1 MRLKDLHILPRVSDSWSYLYVEHCRIDQDARAISLHDATG-KTMVPCASLSLLMLGPGTS 59 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM 119 ++HAA++ A G L+ WVGE GVR YA G + L QA L D +L L+VVR+M Sbjct: 60 ITHAAIQTLADNGCLVAWVGEEGVRFYAQGMGETRSATNTLRQAMLWSDPELHLQVVRRM 119 Query: 120 FELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQC 179 +E+RF P S++Q+RG+EG+RVR Y L+++ GV W GR Y K W D IN+ Sbjct: 120 YEIRFRHPINPNTSLKQIRGMEGARVRGAYLQLSRETGVEWKGRDYSSKSWHSNDAINRA 179 Query: 180 ISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRN 239 ISAA SCLYGV AAI++AGY+ A+GF+HTGK LSFVYD+AD+ K + +P AF Sbjct: 180 ISAANSCLYGVCHAAIVSAGYSTALGFIHTGKMLSFVYDVADLYKTEISMPAAFYAVAEG 239 Query: 240 PGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLG 298 + VR CRDI R ++ LA+++ I+ VL + P P + P V G Sbjct: 240 GASLESRVRRKCRDILRETRLLARIVEDIDTVL---NVDSPIPHKYQNPYDSDPGVPGG 295 >UniRef50_D2RB04 CRISPR-associated endonuclease Cas1, ECOLI subtype n=3 Tax=Bacteria RepID=D2RB04_GARVA Length = 313 Score = 370 bits (951), Expect = e-101, Method: Composition-based stats. 
Identities = 105/296 (35%), Positives = 165/296 (55%), Gaps = 7/296 (2%) Query: 10 PLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAA 69 + DRVS I++++ +I+ +D A + D G +P + ++L PGT ++H A+ L Sbjct: 17 RISDRVSFIYVEHAKINRLDSAVTVFDANGT-IRVPAAMIGVLLLGPGTEITHRAMELLG 75 Query: 70 QVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP 129 VG +VWVGE GVR YA G+ S L Q+KL + RL V RKM+++RF Sbjct: 76 DVGASIVWVGEHGVRNYAHGRALSRSSRLLEKQSKLVTNSRSRLNVARKMYQMRFPNENV 135 Query: 130 ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 + +++QLRG EG+RVR Y ++ +Y V WNGR Y D+E G +N+ +S CLYG Sbjct: 136 SSYTLQQLRGREGARVRHLYREMSNKYNVQWNGRDYKVNDFESGTVVNKALSVGNVCLYG 195 Query: 190 VTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP--DREV 247 + + I A G AP +GFVHTG LS VYDIAD+ K + +P +FEIA R + ++ + Sbjct: 196 LVHSIISALGLAPGLGFVHTGHDLSLVYDIADLYKAELTIPASFEIAARCESDDDIEQLM 255 Query: 248 RLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP----EDAQPVAIPLPVSLGD 299 RL RD F + +++++ I+++L D + + + V+ + Sbjct: 256 RLKMRDCFANCNIMSRIVNDIQNLLEIPIDDQITVDVIHLWDDKELLVASGVNYSE 311 >UniRef50_B4V4P5 Crispr-associated protein cas1 n=4 Tax=Streptomyces RepID=B4V4P5_9ACTO Length = 315 Score = 368 bits (946), Expect = e-101, Method: Composition-based stats. 
Identities = 97/281 (34%), Positives = 149/281 (53%), Gaps = 1/281 (0%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + +R+S ++L+ + A D G THIP ++ ++L PGTR++H A Sbjct: 12 PRELTRVAERISFVYLERCVVHRDANAITAEDADGT-THIPSATIGTLLLGPGTRITHQA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + + A+ G + WVGE GVR YA G+ S + QA L + RL+V R M+ LRF Sbjct: 71 MSVLAESGAAVAWVGEQGVRYYAGGRALSRSSALVEAQATLWANRRTRLEVARAMYRLRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + P+ + +L G EG RV+ Y A + GV W GR Y P D+ GD NQ ++AA Sbjct: 131 PDEDPSGLTRRELLGHEGYRVKECYRHQADRTGVPWRGRHYVPGDFTAGDAPNQAVTAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C+YG+ A + A G A +GFVH+G LSFV D+AD+ K + +P AF++A + + Sbjct: 191 QCMYGIAHAVVAALGCATGLGFVHSGHELSFVLDVADLYKTEIGIPVAFDVAAESTEDIG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPED 285 R A RD ++ L + + I+ +L + +D Sbjct: 251 SRTRRALRDAVNKNRLLDRCVNDIKLLLQPEGPGAASAADD 291 >UniRef50_C1YVP5 CRISPR-associated protein Cas1 n=1 Tax=Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111 RepID=C1YVP5_NOCDA Length = 327 Score = 367 bits (943), Expect = e-100, Method: Composition-based stats. 
Identities = 102/302 (33%), Positives = 148/302 (49%), Gaps = 6/302 (1%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + +R+S ++L+ + A D G R +P ++ ++L PGT V+H+A Sbjct: 12 PRELTRVGERLSFLYLERCVVHRDSNAITAEDGDGTRY-LPSATIGTLLLGPGTNVTHSA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + L + G +VWVGE GVR YA+G+ S + QA + RL V R M+ +RF Sbjct: 71 MSLLGECGATVVWVGEHGVRYYAAGRALTRSSRLVEAQATAWANRRSRLDVARAMYRMRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + S + L G EG RV+A Y A + GVTW GRRY P D + D N+ I+AA Sbjct: 131 PDLDVEALSRQALLGKEGDRVKACYREQAARTGVTWRGRRYVPGDHDVSDPPNKAITAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C YGV A A G +P +GFVH+G FV D+AD+ K + +P AF+ A + + D Sbjct: 191 QCFYGVAHAVTAALGCSPGLGFVHSGHERGFVMDVADLYKVEIGIPVAFDAAAQGDEDVD 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIED-VLAAGEIQPPAPPEDAQPVAIPLPV----SLGD 299 R RD L + + I+ +L G + EDA A +GD Sbjct: 251 GVTRRLLRDRINEEGLLERCVRDIKALLLGEGSVGAQGEAEDAGESANDDVADTVGLVGD 310 Query: 300 AG 301 G Sbjct: 311 RG 312 >UniRef50_Q3ZZ81 CRISPR-associated protein Cas1 n=4 Tax=Bacteria RepID=Q3ZZ81_DEHSC Length = 309 Score = 365 bits (936), Expect = 1e-99, Method: Composition-based stats. 
Identities = 118/280 (42%), Positives = 179/280 (63%), Gaps = 2/280 (0%) Query: 6 LNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 L+ +P +DR S ++L+ G++DV + + +P+ + +ML PG+ V+HAA Sbjct: 4 LHELPRFRDRWSYLYLEMGRLDVEADSL-GFHQGDTVVPVPIDQLGVVMLGPGSTVTHAA 62 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 ++ +Q L+ W G+ GVR+YA+ G + +L+ QA+L D++ RL+V +M+ RF Sbjct: 63 IKSLSQNNCLIAWTGQDGVRLYAASIGGTYSARRLIRQARLVSDDEKRLEVAWRMYRFRF 122 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 E P S+E +RG+EG RVR YA +++YGV W GR YD KDW KGD IN+ +SAA Sbjct: 123 NEVIPPVVSLESIRGMEGIRVRRAYAKASQEYGVEWKGRHYDQKDWSKGDPINRALSAAN 182 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 +CLYG+ A IL+AGY+ A+GFVHTGK LSFVYD+AD+ K + +P AF++A NP + + Sbjct: 183 ACLYGICHAGILSAGYSSALGFVHTGKMLSFVYDVADLYKTELTIPVAFKVAAANPTDLE 242 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPE 284 R+VR+ CR+ F K L +L+ I +VL + +P E Sbjct: 243 RQVRIECREAFYEFKLLERLLTDIAEVLGVSDDIGESPDE 282 >UniRef50_A8M406 CRISPR-associated protein Cas1 n=5 Tax=Actinomycetales RepID=A8M406_SALAI Length = 322 Score = 359 bits (922), Expect = 6e-98, Method: Composition-based stats. 
Identities = 114/288 (39%), Positives = 160/288 (55%), Gaps = 5/288 (1%) Query: 10 PLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAA 69 +DR+S ++L+ I A D+ GI HIP ++ +ML PGT ++ A+ L A Sbjct: 16 RAQDRISFVYLERCVIHRDSNAITATDEKGI-VHIPAATLGVLMLGPGTSITQQAMMLIA 74 Query: 70 QVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP 129 G +VW+GE GVR YA G+P S L+ QA D RL+V R M+ +RF Sbjct: 75 DNGATVVWIGEHGVRYYAHGRPLARSSRLLVAQAAAVSHRDRRLRVARAMYRMRFPGEDT 134 Query: 130 ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 +++QLRG EG+RVR Y A++ GV+WN R YDP D+ D +NQ +SAA +CLYG Sbjct: 135 TNLTMQQLRGKEGARVRRCYRENAQRTGVSWNSREYDPDDFTGSDPVNQALSAAHACLYG 194 Query: 190 VTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRL 249 + A ++A G +P +GFVHTG SFVYDIAD+ K D +P AF+IA + + R Sbjct: 195 IVHAVVVAVGASPGLGFVHTGHDRSFVYDIADLYKADVTIPVAFDIAAAESTDIGADTRR 254 Query: 250 ACRDIFRSSKTLAKLIPLIEDVL----AAGEIQPPAPPEDAQPVAIPL 293 A RD + L + + I +L AAG I E+A A+ L Sbjct: 255 AVRDRVHNGALLGRCVQDIRRLLLTDSAAGPINEEEFDEEADNDAVRL 302 >UniRef50_C7LYW4 CRISPR-associated protein Cas1 n=1 Tax=Acidimicrobium ferrooxidans DSM 10331 RepID=C7LYW4_ACIFD Length = 314 Score = 357 bits (917), Expect = 3e-97, Method: Composition-based stats. 
Identities = 98/277 (35%), Positives = 158/277 (57%), Gaps = 1/277 (0%) Query: 7 NPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVR 66 P+ R S ++L++ + A V + ++G T++P +V ++L PGTR++H A+ Sbjct: 17 ELQPVSRRSSFVYLEHCVVHRDANAVVSVTESGT-TYLPAAAVGTLLLGPGTRITHQAML 75 Query: 67 LAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE 126 L + G ++ WVGE R+YA + L QA+L + RL+V R+M+++RF Sbjct: 76 LLGESGVVVCWVGEGDTRLYAWAPSLFQSTRFLEAQARLVSNRQDRLRVARQMYQMRFPG 135 Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 ++ ++++LRG+EG+R+R TY LA +G+ W+GR YDP + GD +N+ +S A S Sbjct: 136 EDVSKATMQRLRGMEGARIRRTYRHLASAFGIDWHGRHYDPNNSSAGDDVNRALSIANSV 195 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDRE 246 LYGV AI+A G +P +GFVHTG LSFVYD+AD+ K + +P AFE A + G + Sbjct: 196 LYGVVHTAIVALGCSPGLGFVHTGHSLSFVYDVADLYKVELAIPVAFEAAAQRTGSLSSQ 255 Query: 247 VRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP 283 VR R+ + L + + I +L + P Sbjct: 256 VRRTMRERIHEAHLLERAVDDIRLLLGTPDADLGGEP 292 >UniRef50_B1VIX8 CRISPR-associated protein n=6 Tax=Corynebacterium RepID=B1VIX8_CORU7 Length = 312 Score = 355 bits (911), Expect = 1e-96, Method: Composition-based stats. 
Identities = 93/269 (34%), Positives = 154/269 (57%), Gaps = 1/269 (0%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 + DR+S ++++ + A + D+ G+ H+P +A ++L GTR+++AA+ L Sbjct: 17 MGDRISFLYVERAVVSRDGNALTITDQRGV-AHVPATQLAVLLLGTGTRITNAAMALLGD 75 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 G VWVGE GVR YA G+P S QA++ ++ RL+ R+M+ LRF + Sbjct: 76 CGVSTVWVGERGVRYYAHGRPPAKSSRLAELQARVVTNQRKRLECARRMYGLRFPGEDVS 135 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 + ++ QLRG EG+R++ YA AK+ GV WN RRYDP D++ D INQ ++ ++ LYG+ Sbjct: 136 KLTMAQLRGREGARMKRLYAAEAKRTGVAWNRRRYDPNDYDSSDPINQALTTGSAALYGI 195 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 A I+ G+ PA+G +HTG SFVYD+AD+ K + +P AF + VR Sbjct: 196 AHAVIVGLGFVPALGVIHTGTDRSFVYDVADLYKAEVSIPAAFNAVASGTEDVGPMVRRL 255 Query: 251 CRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 RD + + +++ ++ V++ + +P Sbjct: 256 VRDAVVEQRLMPRMVRDLKFVMSVPDDEP 284 >UniRef50_C7MTM5 CRISPR-associated protein, Cas1 family n=6 Tax=Actinomycetales RepID=C7MTM5_SACVD Length = 342 Score = 342 bits (877), Expect = 9e-93, Method: Composition-based stats. 
Identities = 112/300 (37%), Positives = 169/300 (56%), Gaps = 7/300 (2%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + L DRVS ++++ +D + A +I++ +P VA ++L PGTRV+H A Sbjct: 12 PHDLHRLTDRVSSVYIERSHLDRAENAIAIINRRET-VRLPAALVAVVLLGPGTRVTHGA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 ++L A GT + WVGE GVR+YA+G + L QA L RL+V R M+ +RF Sbjct: 71 MQLLADSGTAVCWVGEQGVRMYAAGLGPSRGAALLQRQAYLVSRTTTRLEVARAMYAMRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKD-WEKGDTINQCISAA 183 + +++QLRG EG+RVR Y A+Q+GV WNGR Y D + GD +N+ +SAA Sbjct: 131 PGEDVSTLTMQQLRGREGARVRKVYRQQARQHGVPWNGRAYKAGDAFAVGDDLNRLLSAA 190 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 + LYG+ A I+ G +P +GF+HTG SFV DIAD+ K + +P AF++A R E Sbjct: 191 NAALYGICHAVIVGLGASPGLGFIHTGSATSFVMDIADLYKAEYTIPLAFQLAARGLLE- 249 Query: 244 DREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP----EDAQPVAIPLPVSLGD 299 +R+ R A RD + L ++I ++ +LA + P P D + +P V+ D Sbjct: 250 ERDARTALRDRIAGTGLLPRIIKDVKTLLAPEGVDLPDPEVNLLWDERGNPVPGGVNWSD 309 >UniRef50_Q03C58 CRISPR-associated protein n=3 Tax=Lactobacillus RepID=Q03C58_LACC3 Length = 315 Score = 341 bits (875), Expect = 2e-92, Method: Composition-based stats. 
Identities = 102/265 (38%), Positives = 161/265 (60%), Gaps = 3/265 (1%) Query: 7 NPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVR 66 +++RV+ ++L++ +I+ D A V+ID TG IP ++ +ML PG V+H A+ Sbjct: 12 ELSRVRERVTFLYLEHAKINRQDSAIVVID-TGGTVAIPAALISVLMLGPGVDVTHRAME 70 Query: 67 LAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE 126 L G +VWVGE GVR YA G+ S L+ QAKL + LR+ V R+M+++RF + Sbjct: 71 LMGDAGMSVVWVGERGVRQYAPGRALTHSSALLVAQAKLVSNNRLRVGVARQMYQMRFPD 130 Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 + S+++LRG EG+RVR Y +++ GV W R YDP++++ G INQ ++AA + Sbjct: 131 DDVSTLSMQELRGKEGARVRRIYREESRRTGVEWTHREYDPENYQSGSIINQALTAAHAA 190 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIA--RRNPGEPD 244 LYG++ + I+A G +P +GFVHTG LSFVYD AD+ K + +P AF +A + Sbjct: 191 LYGLSYSVIVALGASPGLGFVHTGHDLSFVYDFADLYKAEVTIPIAFTVAANATEQDDIG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIE 269 + RLA RD F K + +++ ++ Sbjct: 251 QLTRLAVRDAFVDGKLMIRMVADLK 275 >UniRef50_C2BS02 CRISPR-associated protein n=1 Tax=Mobiluncus curtisii ATCC 43063 RepID=C2BS02_9ACTO Length = 314 Score = 339 bits (869), Expect = 8e-92, Method: Composition-based stats. 
Identities = 87/270 (32%), Positives = 148/270 (54%), Gaps = 1/270 (0%) Query: 10 PLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAA 69 ++DR+S ++++ ++ A + D GI HIP V ++L PGT+V++AA+ L Sbjct: 17 RMEDRLSFLYVERAILNREGNALTIQDSRGI-AHIPATQVGVVLLGPGTKVTYAAMALLG 75 Query: 70 QVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP 129 G VWVGE GVR YA G+P S A+L ++ RL+ R+M+ +RF Sbjct: 76 DAGCSAVWVGEKGVRYYAHGRPAAKTSRMAEAHARLWANQRSRLRCARRMYSMRFPGEDV 135 Query: 130 ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 + + QLRG EG+R++ YA +++ GV W R YDP D+ GD IN ++ + LYG Sbjct: 136 SNLPLSQLRGREGARMKRIYAEQSRRTGVPWTRRSYDPNDFGAGDPINCALTEGAAALYG 195 Query: 190 VTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRL 249 + A ++ G+ P++G +H+G +FVYD+AD+ K + +P AFE + + VR Sbjct: 196 IAHAVVVGLGFIPSLGIIHSGTDRAFVYDVADLYKAEISIPAAFEAVAASAEGDELNVRK 255 Query: 250 ACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 RD +++ + +++ ++ V+ Sbjct: 256 RIRDKVVTTRLMQRMVRDLQYVMEIPTDDA 285 >UniRef50_C7MTL6 CRISPR-associated protein, Cas1 family n=4 Tax=Actinomycetales RepID=C7MTL6_SACVD Length = 328 Score = 338 bits (868), Expect = 1e-91, Method: Composition-based stats. 
Identities = 96/292 (32%), Positives = 160/292 (54%), Gaps = 12/292 (4%) Query: 10 PLKDRVSMIFLQYGQIDVID-GAFVLI---DKTGIRTHIPVGSVACIMLEPGTRVSHAAV 65 + D +S ++L+ ++ D G + D R ++PV +++CI+ GT V+ A+ Sbjct: 22 RVADSLSFLYLENVRVVQDDTGVCAYVEQPDGGTSRVYLPVAAISCILFGTGTSVTQPAM 81 Query: 66 RLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG 125 A+ T ++W G GVR+Y+ ++ L Q + D+ RL V +M+ +RFG Sbjct: 82 ATCARHNTTVLWTGSGGVRMYSGSLAPNLTTEWLERQVRAWADDSTRLAVAARMYSMRFG 141 Query: 126 EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATS 185 PA S+ LRG+EG R++A Y LA ++G+ R YDP +W + + +NQ +SAA + Sbjct: 142 AEVPAGTSLNTLRGLEGQRMKALYRSLADRHGLRGFKRNYDPANWGEQNPVNQALSAANT 201 Query: 186 CLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDR 245 LYG +A+LA G +PA+GF+H+GK SFVYD+AD+ K +P AF + + +PDR Sbjct: 202 ALYGAVHSALLALGCSPALGFIHSGKQHSFVYDVADLYKAKHTIPLAFALHKS--AQPDR 259 Query: 246 EVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSL 297 EVR+ R F + + +++ ++ +L P+ +D P V L Sbjct: 260 EVRIRMRQDFHLYRLMPRIVRDVQRLL------DPSIAQDHDETGEPEEVEL 305 >UniRef50_C4X9I5 CRISPR-associated Cas1 family protein n=12 Tax=Bacteria RepID=C4X9I5_KLEPN Length = 294 Score = 313 bits (802), Expect = 5e-84, Method: Composition-based stats. 
Identities = 92/274 (33%), Positives = 147/274 (53%), Gaps = 2/274 (0%) Query: 6 LNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 +P +KD+ ++L+ G++++ D + +D G +PV ++ ++L PGT V+H A Sbjct: 16 RELLPQVKDKYPFLYLERGRLEIDDSSVKWVDADGNVVPLPVATINTLLLGPGTTVTHEA 75 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 ++ A + WVGE + YA+G A + L Q LA D LKV R MF RF Sbjct: 76 IKTATAANCAVCWVGEDSLLFYAAGFLPTADTRNLKAQMALACDASSTLKVARAMFAKRF 135 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + +S+ + G+EGSRVRA Y A++YGV W GR++ P +E D NQ +++ Sbjct: 136 PDADLEGKSLNSMMGMEGSRVRALYQQKAQEYGVGWKGRQFTPGKFELSDLTNQVLTSTN 195 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 + LYG+ + + A GY+P IGF+H+G PL FVYD+AD+ K + AF ++R G D Sbjct: 196 AALYGILCSVVHAMGYSPHIGFIHSGSPLPFVYDLADLYKERLCIDLAFSLSREMAGRYD 255 Query: 245 RE-VRLACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 + V A R + L + I +++ Sbjct: 256 KHKVSEAFRKRVIALDLLNLIAADINELMGGKGA 289 >UniRef50_B6IWM1 CRISPR-associated protein Cas1, putative n=1 Tax=Rhodospirillum centenum SW RepID=B6IWM1_RHOCS Length = 281 Score = 299 bits (766), Expect = 7e-80, Method: Composition-based stats. 
Identities = 86/275 (31%), Positives = 136/275 (49%), Gaps = 6/275 (2%) Query: 4 LPLNPIPLKDRVSMIFLQYGQIDVIDGAFVL-IDKTGIRTHIPVGSVACIMLEPGTRVSH 62 L IP K R +I+++ ++ + +G+ V+ D G +P + ++L PG+ ++H Sbjct: 8 LDAARIPQKSRNGLIYVERCRLSIDNGSLVIAFDDRGEELELPYQRLNAVLLGPGSSITH 67 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 AVR + GT L +VG G R+Y + S QA E R+ V ++M+ Sbjct: 68 DAVRHCSGHGTCLAFVGSDGTRLYTAPPLFDRDSTLARQQATWWAGESTRIMVAKRMYAK 127 Query: 123 RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISA 182 RFGE P S++ LRG+E +R+R +Y L+A Q G+ W GRR+D D + D NQ I+ Sbjct: 128 RFGE-TPRATSLDSLRGMEAARIRHSYELIAAQAGIVWRGRRFDRSDPDGDDLPNQAINH 186 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARR---- 238 + + A+ A G P +GF+H S+ DI D+ + VP AF +R Sbjct: 187 VVTAVEACVAIAVQATGTLPPLGFLHEDSAKSWTLDICDLYRTSVTVPLAFRCVKRIDQG 246 Query: 239 NPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 DR R A R + + +I I++VLA Sbjct: 247 ATDSLDRICRRAVSAHVRDTGFIDTIIDDIKEVLA 281 >UniRef50_C9M9R9 CRISPR-associated protein Cas1 n=1 Tax=Jonquetella anthropi E3_33 E1 RepID=C9M9R9_9BACT Length = 281 Score = 299 bits (766), Expect = 8e-80, Method: Composition-based stats. 
Identities = 97/272 (35%), Positives = 145/272 (53%), Gaps = 10/272 (3%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTH-----IPVGSVACIMLEPGTRVSHAAVRLAAQV 71 M++L+ G + V DG + G IP +V+ I+LEPGT ++H RL Q Sbjct: 1 MLWLERGNLFVKDGTLRFVSAGGGSLEKGTYDIPYQNVSMIVLEPGTTITHDVFRLMGQQ 60 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPAR 131 GT L+ VG+ GVR Y + G RS Q +L + RL+V M+ +RFGE P R Sbjct: 61 GTGLIAVGDKGVRCYTAPPLGPDRSALARRQVELWANPQTRLQVALAMYAIRFGEELPTR 120 Query: 132 RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVT 191 + +E LRGIEG+R+R +Y++LAK YG+TW RR++ K K D IN ++ A S +YG Sbjct: 121 K-IEDLRGIEGARLRKSYSILAKFYGLTWTLRRFNRKQPNKTDDINAAVNHAASAMYGAA 179 Query: 192 EAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP----DREV 247 + A+ A P +GFVH +F DIAD+ + + +P AF EP +R V Sbjct: 180 DIAVAAVSAIPQLGFVHAKSCRAFALDIADLYRTEITLPAAFRGLASYLEEPGMDLERHV 239 Query: 248 RLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 R K ++K+I I++++ G+ Sbjct: 240 RKLIGQELYRQKVISKMIDQIKELILHGQSDE 271 >UniRef50_Q47PJ6 CRISPR-associated protein, Cas1 family n=4 Tax=Actinomycetales RepID=Q47PJ6_THEFY Length = 332 Score = 296 bits (757), Expect = 9e-79, Method: Composition-based stats. 
Identities = 92/310 (29%), Positives = 157/310 (50%), Gaps = 21/310 (6%) Query: 10 PLKDRVSMIFLQYGQIDVID-GAFVLIDKTGIRTH---IPVGSVACIMLEPGTRVSHAAV 65 + D +S +++ +I D G ++ R H IP S+AC++L PGT ++ A+ Sbjct: 18 RVSDGLSFLYVDVCRIVQTDTGVCAEVETETGRIHRVPIPTASLACVLLGPGTSITSPAM 77 Query: 66 RLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG 125 + T +V G G+ Y S + + QA+ D+ R V +M+E+RFG Sbjct: 78 ATFMRHNTTVVTCGAGGILNYGSFPAPNRTTKWIDRQARAYSDDRRRRDVAVRMYEMRFG 137 Query: 126 EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATS 185 E P S+E+LR +EG+R++A Y LA + V R Y+P DW+ D +N+ +SA+ + Sbjct: 138 EEPPPGASIERLRQLEGARMKALYRSLAAKNRVKPFKRNYNPHDWDDQDPVNKALSASNA 197 Query: 186 CLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDR 245 LYGV + + G PA+GF+H+GK +FVYDIAD+ K T +P AF ++R P++ Sbjct: 198 ALYGVVHSVLAHLGCHPALGFIHSGKQDAFVYDIADLYKARTTIPLAFSLSRT--ANPEQ 255 Query: 246 EVRLACRDIFRSSKTLAKLIPLIEDVLAAGE----IQPPAPP-----------EDAQPVA 290 E RL R + + + +++ ++ +L+ + + PP D A Sbjct: 256 EARLRLRRDLKLYRLIPQIVRDVQTLLSLDDPEEAVSEEEPPSSGGPWQVVDLWDPVVGA 315 Query: 291 IPLPVSLGDA 300 + V+ + Sbjct: 316 VSGGVNYANH 325 >UniRef50_Q21QB1 CRISPR-associated protein Cas1 n=1 Tax=Rhodoferax ferrireducens T118 RepID=Q21QB1_RHOFD Length = 277 Score = 294 bits (752), Expect = 3e-78, Method: Composition-based stats. 
Identities = 105/267 (39%), Positives = 154/267 (57%), Gaps = 9/267 (3%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 K+R+ +FL+ G + V DG +L+ + IP V+C+M+EPG V+H A++L + Sbjct: 17 HKNRIPYLFLEKGILRV-DGHCLLLCQAESAIEIPGSMVSCLMIEPGVSVTHEAMKLCGE 75 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 GTLL+WVGE G R YA+ + ++L QA + ++ R+ +++ L F + P Sbjct: 76 NGTLLMWVGEGGTRFYAAAHA-HQDASRVLRQAAIHTNQRERIAAASRLYGLMFDDHMPP 134 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 ++E+LRG+EGSRV+ Y LA + G+ W GR E+ +N I ATSCLY + Sbjct: 135 SFTIEKLRGLEGSRVKEIYVNLADKLGMVWQGR-------EEKSALNTSIGFATSCLYAL 187 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 E AILAAGY P IG VH+G P S V+D+AD +KF TVVP AFEIA +P + VR Sbjct: 188 CEVAILAAGYHPGIGVVHSGNPRSLVFDLADTVKFKTVVPLAFEIAATSPSNLNMAVRHG 247 Query: 251 CRDIFRSSKTLAKLIPLIEDVLAAGEI 277 CRD+F L+ +E++ Sbjct: 248 CRDLFSRESMFETLLGHLENIFGTDHD 274 >UniRef50_Q0AA34 CRISPR-associated protein Cas1 n=11 Tax=Bacteria RepID=Q0AA34_ALHEH Length = 298 Score = 291 bits (744), Expect = 3e-77, Method: Composition-based stats. 
Identities = 91/279 (32%), Positives = 143/279 (51%), Gaps = 10/279 (3%) Query: 9 IPLKDRVSMIFLQYGQIDVIDGAFVLI-----DKTGIRTHIPVGSVACIMLEPGTRVSHA 63 IP DR +++L G++ V DG D IP ++ I+L PG+ V+H Sbjct: 18 IPHVDRHGLLWLTRGRLYVEDGTLHFTAAESEDLAAGDYAIPYQGLSMILLGPGSTVTHD 77 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 +RL A+ GTLL +G G + Y + G RSD A L ++ RL V R+M+ R Sbjct: 78 VLRLLARHGTLLAAIGGGGTKYYTAPPMGQGRSDVARRHATLWANKTQRLDVARRMYAFR 137 Query: 124 FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAA 183 FG P + + LRGIEG R++ Y + A ++G+ W GRRY+ + D NQ I+ A Sbjct: 138 FGRVLPH-KDIAVLRGIEGGRIKELYRVEASRFGIPWKGRRYNRNNPSAADVPNQAINHA 196 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 + + + A+ A G P +GF+H +F DIAD+ + + VP AF+ AR+ +P Sbjct: 197 ATFVEAAADIAVAATGALPPLGFIHEESSNAFTLDIADLYRGEITVPLAFQAARKVLDDP 256 Query: 244 ----DREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 +R +R F+ K + K+I I+D++ A + Sbjct: 257 TLSIERTLRRDAASAFQRHKVIPKMIDRIKDLINADDNG 295 >UniRef50_C2KP50 CRISPR-associated Cas1 family protein n=5 Tax=Actinomycetales RepID=C2KP50_9ACTO Length = 312 Score = 278 bits (711), Expect = 2e-73, Method: Composition-based stats. 
Identities = 72/278 (25%), Positives = 129/278 (46%), Gaps = 17/278 (6%) Query: 7 NPIPLKDRVSMIFLQYGQIDVI--------DGAFVLIDKTGIR--THIPVGSVACIMLEP 56 I L+DRVS ++L+Y Q+ +G D+ ++ IPV +A + L P Sbjct: 17 EQIRLEDRVSYLYLEYCQVIQNHTGVAAISEGNHDSEDREPLKRIIQIPVAGLAVLFLGP 76 Query: 57 GTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVV 116 GT ++ A+ ++ G +++ G G Y+ + S + QA L DE K Sbjct: 77 GTSITQPAMASCSRAGLTVIFSGGGGCPYYSHAMALTSSSRWAIAQAHLVADERNARKAA 136 Query: 117 RKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 + +++ + G ++ Q+RG+EGS ++ Y L++++ V R D D + Sbjct: 137 KFLYKRQLGIDIERELTISQMRGLEGSLIKKRYRELSREFKVNGFRR-----DTGGEDVL 191 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIA 236 NQ ++ LYG +A A G PA+G +H G S ++D+AD+ K + +P +F Sbjct: 192 NQALNLVNGILYGCAASACAALGVNPALGIIHRGDIRSLLFDVADLYKPNAALPISFRSV 251 Query: 237 RRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 ++ EP + R R L +I ++ +VL Sbjct: 252 SKD--EPLKFARKEMRRFIYEQNVLENMISILMNVLEP 287 >UniRef50_Q3J7J6 CRISPR-associated protein, Cas1 family n=2 Tax=Nitrosococcus oceani RepID=Q3J7J6_NITOC Length = 295 Score = 257 bits (657), Expect = 3e-67, Method: Composition-based stats. 
Identities = 76/294 (25%), Positives = 121/294 (41%), Gaps = 33/294 (11%) Query: 8 PIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTH---IPVGSVACIMLEPGTRVSHAA 64 PI R + +L++ ++ D V + G T IP + I+L GT ++ AA Sbjct: 2 PILPSHRQGLYYLEHCRVMAKDERVVYACQEGAFTKFFAIPPANTNVILLGSGTSLTQAA 61 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR- 123 RL A ++ +VG G ++ + Q ++ +L D D RLKV + R Sbjct: 62 ARLLASEQVMVAFVGGGGSPLFLASQNEYRPTEYCQAWMRLWQDNDQRLKVAKTFQRNRA 121 Query: 124 ---------FGEPAPARRSVE----------QLRGIEGSRVRATYALLAKQYGVTW---- 160 EP P + S+E +L G G+ + A A AK+ W Sbjct: 122 EFLMQQWPKLAEPKPHKASLEKLAERYLADIELAGDNGT-ILAQEAKFAKKLYKFWANCT 180 Query: 161 --NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH-TGKPLSFVY 217 DP + D N + +YG+ A + G ++ +H T + + V+ Sbjct: 181 ETENFTRDPGKRDFNDPFNSYLDHGNYLVYGIAAAVLWVLGIPHSLPVIHGTTRRGALVF 240 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDV 271 D+ADIIK V+P AF+ A G D+E+R AC S + L I+ V Sbjct: 241 DVADIIKDTCVMPIAFQHAA--AGRSDQEMRQACIAWLDESHAMTFLFQSIKRV 292 >UniRef50_C8Q0H7 CRISPR-associated protein Cas1 n=6 Tax=Proteobacteria RepID=C8Q0H7_9GAMM Length = 334 Score = 236 bits (601), Expect = 9e-61, Method: Composition-based stats. 
Identities = 58/302 (19%), Positives = 119/302 (39%), Gaps = 42/302 (13%) Query: 7 NPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKT----GIRTHIPVGSVACIMLEPGTRVSH 62 P+ L R + +L+ ++ + D V + ++ +IP + A ++L G+ ++ Sbjct: 27 RPLMLSKRACVFYLERVRVILKDDRIVYLTESMQPIEHFYNIPEKNTAFLLLGKGSSITD 86 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYAS-------GQPGGARSDKLLYQAKLALDEDLRL-- 113 AA R A+ ++ + G G ++++ Q ++ + K LD+ RL Sbjct: 87 AAARRLAESNVMVGFCGSGGSPLFSALDLTFLAPQSEYRPTEYMQIWMKAWLDDTTRLLM 146 Query: 114 ---------KVVRKMFE-----------------LRFGEPAPARRSVEQLRGIEGSRVRA 147 ++V+K ++ + F + + + EQL EG + Sbjct: 147 AKVLLQERIEIVKKYWQKNPLLTSYGIRLDESAVVNFSQAIESAMNQEQLLTAEGRWAKV 206 Query: 148 TYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFV 207 Y LA+ G + + + D N + YG A+ G + A+ + Sbjct: 207 LYKSLAEGCGFKFTREEGKNANDDIADIANSYLDHGNYIAYGYAAVALHGLGISFALPML 266 Query: 208 HTGKPL-SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIP 266 H V+D+AD++K V+P+AF A+ G +E R+ +I + L + Sbjct: 267 HGKTRRGGLVFDVADLVKDAMVMPQAFISAK--LGHNQKEFRMQLIEICQDQDVLDYMFG 324 Query: 267 LI 268 + Sbjct: 325 FV 326 >UniRef50_C6CA70 CRISPR-associated protein Cas1 n=56 Tax=Bacteria RepID=C6CA70_DICDC Length = 333 Score = 231 bits (588), Expect = 3e-59, Method: Composition-based stats. 
Identities = 69/311 (22%), Positives = 126/311 (40%), Gaps = 46/311 (14%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRT---HIPVGSVACIMLEPGTRVS 61 L I R ++ +LQ+ +I V G + + G ++ +IP+ + + +ML GT V+ Sbjct: 9 DLKTILHSKRANIYYLQHCRILVNGGRVEYVTEEGNQSLYWNIPIANTSVVMLGTGTSVT 68 Query: 62 HAAVRLAAQVGTLLVWVGEAGVRVYAS-----------GQPGGARSDKLLYQAKLALDED 110 AA+R A+ G ++ + G G ++A+ Q ++ L +E Sbjct: 69 QAAMREFARAGVMVGFCGSGGTPLFAANEAEVAVSWLSPQSEYRPTEYLQDWVSFWFNEQ 128 Query: 111 LRLKVVRKMFELRFGE----------PAPARRSV-----EQL-----RGIEGSR----VR 146 RL ++R G+ +R ++ E L +G+ R V Sbjct: 129 QRLAAAIAFQQVRIGQIRQHWLGGRLARESRFTIKPEHVEALLNRYQQGLVDCRTSNDVL 188 Query: 147 ATYALLAKQ-YGVTWNGRRY----DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYA 201 A++ K Y + N Y K D N+ + YG+ A+ G Sbjct: 189 VQEAMMTKALYRLAANAVSYGDFTRAKRGGGTDLANRFLDHGNYLAYGLAAVALWVLGLP 248 Query: 202 PAIGFVHTGKPL-SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKT 260 + +H V+D+AD+IK ++P+AF A GE +++ R C FR ++ Sbjct: 249 HGLAVLHGKTRRGGLVFDVADLIKDALILPQAFIAAME--GEDEQDFRQRCLTAFRQAEA 306 Query: 261 LAKLIPLIEDV 271 L +I ++ V Sbjct: 307 LDVMIDSLQQV 317 >UniRef50_B7GYY4 CRISPR-associated protein Cas1 n=8 Tax=Proteobacteria RepID=B7GYY4_ACIB3 Length = 321 Score = 213 bits (543), Expect = 5e-54, Method: Composition-based stats. 
Identities = 57/308 (18%), Positives = 112/308 (36%), Gaps = 51/308 (16%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIR---THIPVGSVACIMLEPGTRVS 61 L I R ++ +L++ ++ DG + + + +IP+ + I+L GT ++ Sbjct: 8 DLKAILHSKRANLYYLEHCRVMQKDGRVLYLTEAKNENQYWNIPIANTTVILLGTGTSIT 67 Query: 62 HAAVRLAAQVGTLLVWVGEAGVRVYAS-------GQPGGARSDKLLYQAKLALDEDLRLK 114 AA+R+ G L+ + G G ++A Q ++ + DE RL Sbjct: 68 QAAMRMLCSAGVLVGFCGGGGTPLFAGSEVEWLTPQSEYRPTEYMQGWMSFWFDETKRLD 127 Query: 115 VVRKMFELR--------------------------------FGEPAPARRSVEQLRGIEG 142 V + R F + P V L E Sbjct: 128 VAKAFQFARIEFIRKIWAKDKDLKDEGFYLDNLDIQQALNGFEKKIPNMTKVGDLLLAEA 187 Query: 143 SRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAP 202 + Y + A + +++ + E+GD N ++ YG++ + G + Sbjct: 188 QTTKQLYKIAATRCKLSFER------NPEQGDLANDFLNHGNYLAYGLSATTLWVLGISH 241 Query: 203 AIGFVHTGKPL-SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTL 261 + +H + V+D+AD+IK V+P AF A+ ++E R F + L Sbjct: 242 SFAVMHGKTRRGALVFDVADLIKDAVVLPWAFICAKEGAT--EQEFRQQLLQKFTDYRCL 299 Query: 262 AKLIPLIE 269 + ++ Sbjct: 300 DWMFDQVK 307 >UniRef50_A9BUF1 CRISPR-associated protein Cas1 n=33 Tax=Proteobacteria RepID=A9BUF1_DELAS Length = 337 Score = 199 bits (506), Expect = 1e-49, Method: Composition-based stats. 
Identities = 66/334 (19%), Positives = 122/334 (36%), Gaps = 56/334 (16%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRT---HIPVGSVACIMLEPGTRVS 61 L I R ++ +L++ ++ V G + G R+ +IP+ + I+L GT ++ Sbjct: 8 DLKTILHSKRANIYYLEHCRVLVNGGRVEYVTDAGKRSLYWNIPIANTTSILLGAGTSIT 67 Query: 62 HAAVRLAAQVGTLLVWVGEAGVRVYAS-----------GQPGGARSDKLLYQAKLALDED 110 AA+R A+ G L+ + G G ++++ Q ++ L + D++ Sbjct: 68 QAAMRELAKAGVLVGFCGGGGTPLFSANEVDVEVAWLTPQSEYRPTEYLQAWVQFWFDDE 127 Query: 111 LRLKVVRKMFELR-------------------------------FGEPAPARRSVEQLRG 139 LRL +++ LR F V L Sbjct: 128 LRLHAAKQLQALRLQRLQQEWGARALRESGFAVDMERLKALVQQFAALMANAPDVMTLLT 187 Query: 140 IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAG 199 E +A + L G G K D N+ + YG+ A G Sbjct: 188 DEARLTKALFKLAVDAVGY---GEFTRAKRGTGTDGANRYLDHGNYLAYGLGATATWVLG 244 Query: 200 YAPAIGFVHTGKPL-SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSS 258 + +H V+D AD++K ++P+AF A R G+ + + R C + S Sbjct: 245 LPHGLAVLHGKTRRGGLVFDAADLVKDAAILPQAFLSAMR--GDDEPQFRRQCIEALTRS 302 Query: 259 KTLAKLIPLIEDVL-----AAGEIQPPAPPEDAQ 287 ++L +I ++ + AG P ++A Sbjct: 303 ESLDFIIDTLKRIAVDTARMAGASATPGAAQEAG 336 >UniRef50_D1BQ37 CRISPR-associated protein Cas1 n=1 Tax=Veillonella parvula DSM 2008 RepID=D1BQ37_VEIPT Length = 331 Score = 145 bits (367), Expect = 1e-33, Method: Composition-based stats. 
Identities = 60/284 (21%), Positives = 109/284 (38%), Gaps = 39/284 (13%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 I G VL + IP+ +V+ ++L ++S + + G+L+ +V + + Sbjct: 14 IRQDGGLLVLEKDHSVIKEIPIATVSNLVLGRTIQISTQVMFSLVKQGSLIQFV-DHKYQ 72 Query: 85 VYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA---------------- 128 + + +LL+Q +++ L + + + Sbjct: 73 LVGTLGDEHTPLQRLLWQVACFQNQEFALDGAKYIVRRKIKGQIALLNQYKKSKCIPNFV 132 Query: 129 -------------PARRSVEQLRGIEGSRVRATYALLAKQYGVTW--NGRRYDPKDWEKG 173 + VE LRGIEG R +++L W +GR+ P Sbjct: 133 VVHRTMQALLKRVERTKKVETLRGIEGLASRTYFSVLGHVLSEPWEFSGRKRHP----SP 188 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPK 231 D +N +S S L A +L AG IG +H+ + S VYD+ DI + D + Sbjct: 189 DPVNAILSYGYSFLEREVRACLLTAGLDVRIGVLHSTNNRKDSLVYDVMDIFRQDIIDRF 248 Query: 232 AFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 ++ R+ P+ + L+ R F S + K + L ED + A Sbjct: 249 VLKLLNRHMILPE-DFDLSERGCFLSKEANKKWVELYEDYMKAE 291 >UniRef50_D2QT50 CRISPR-associated protein Cas1 n=1 Tax=Spirosoma linguale DSM 74 RepID=D2QT50_9SPHI Length = 351 Score = 116 bits (291), Expect = 9e-25, Method: Composition-based stats. 
Identities = 56/252 (22%), Positives = 90/252 (35%), Gaps = 59/252 (23%) Query: 46 VGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLL----- 100 V I+L GT +S AVRLA + +V++ + G + + K+ Sbjct: 37 AEKVTHILLATGTSLSTDAVRLAMRHNVDIVFIEQQGDPIGRVWHAKLGSTTKIRKRQLE 96 Query: 101 --------YQAKLA--LDEDLRLKVVRKMFELR--------------------------F 124 + D ++ +R + + R Sbjct: 97 ASLGPDGLRWVRAWLLAKLDNQMGFIRSLKKHRPQHAGYLDDKLVRIEAMALSISTLASV 156 Query: 125 GEPAPARRSV----EQLRGIEGSRVRATYALLA----KQYGVTWNGRRYDPKDWEKGDTI 176 GE PA V + LRG+EG+ R + L+ K+Y ++GR P D Sbjct: 157 GEQTPATTCVADVADTLRGLEGTAGRLYFETLSYVLPKEY--QFSGRSSRPAQ----DAF 210 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK--FDTVVPKA 232 N ++ LYG E ++ AG P +GF+H LS VYD + + D VV + Sbjct: 211 NAFLNYGYGMLYGKVEKTLMMAGLDPYVGFLHRDDYNQLSMVYDFIEPYRGWTDEVVFRL 270 Query: 233 FEIARRNPGEPD 244 F + N Sbjct: 271 FTAKKVNKAHIG 282 >UniRef50_A3LCN8 Putative uncharacterized protein n=1 Tax=Pseudomonas aeruginosa 2192 RepID=A3LCN8_PSEAE Length = 112 Score = 110 bits (275), Expect = 6e-23, Method: Composition-based stats. Identities = 64/78 (82%), Positives = 73/78 (93%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 PL P+P+KDR+SM+F+QYGQIDV DGAFV+ID+TG+R HIPVGSVACIMLEPGTRVSHAA Sbjct: 4 PLKPLPMKDRLSMVFVQYGQIDVRDGAFVVIDQTGVRMHIPVGSVACIMLEPGTRVSHAA 63 Query: 65 VRLAAQVGTLLVWVGEAG 82 V LA+ VGTLLVWVGEAG Sbjct: 64 VHLASTVGTLLVWVGEAG 81 >UniRef50_C8W2P4 CRISPR-associated protein Cas1 n=1 Tax=Desulfotomaculum acetoxidans DSM 771 RepID=C8W2P4_DESAS Length = 545 Score = 98.5 bits (244), Expect = 3e-19, Method: Composition-based stats. 
Identities = 51/289 (17%), Positives = 100/289 (34%), Gaps = 40/289 (13%) Query: 5 PLNPIPLKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 P P+P + ++++ + + + IP+ ++ ++L +S Sbjct: 203 PARPLPSLNLGRVLYVDEPGAYVRKKGERVQVTRDKEVLVDIPLCNLEQLVLAGTVNISA 262 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE- 121 ++L GT + +V AG + S + Q K D +LRLK + Sbjct: 263 QVIKLLLDRGTEVHFVSRAGKYYGSLQPALTKNSALRIAQHKAYQDMELRLKYAVLFVQG 322 Query: 122 ---------LRFGEPAPARR-------------------SVEQLRGIEGSRVRATYALLA 153 LR+ ++ S+ L GIEG+ R + + Sbjct: 323 KLANMRTILLRYNRDLKEKQLEEAICRLKSLSKNLYKADSLNSLMGIEGAATREYFRVFN 382 Query: 154 K--QYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT-- 209 + V +N ++ + GD +N +S A + L A++ GY P IGF+H Sbjct: 383 YMIKQHVPFNFQQRSRRPP--GDPVNALLSFAYTLLTKDMIASVSIVGYDPYIGFLHRSD 440 Query: 210 -GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRS 257 G+P + D + + + + D + F + Sbjct: 441 YGRP-ALALDFIEEFRPIVADSVVLTVLNKGMINTD-DFEYKMGGCFLN 487 >UniRef50_B9LWK7 CRISPR-associated protein Cas1 n=4 Tax=Halobacteriaceae RepID=B9LWK7_HALLT Length = 331 Score = 95.1 bits (235), Expect = 2e-18, Method: Composition-based stats. 
Identities = 50/256 (19%), Positives = 82/256 (32%), Gaps = 40/256 (15%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 G++ + + G H+PV S+ + L + A+ L G G Sbjct: 12 GELSRSEDTLRIDTLDGEVEHLPVESIDTLYLHGQIDFNTRALGLLNDHGVPAHVFGWKD 71 Query: 83 VRVYASGQPGGAR---SDKLLYQAKLALDEDLRLKVVRKM--------------FELR-- 123 Y + ++ Q + D D RL + M + R Sbjct: 72 --YYKGSYLPKRSHLSGNTVVEQVRAYDDPDRRLGIATLMIEASIHNMRANLVYYNARDC 129 Query: 124 -FGEPAPA----------RRSVEQLRGIEGSRVRATYALLAKQYGVTW--NGRRYDPKDW 170 F S++ LRG E + + Y+ ++ + + R Y+P Sbjct: 130 SFDSEIDRLESLKTKASTAESIDGLRGTEATARKTYYSCFSEILRDPFALDRREYNPP-- 187 Query: 171 EKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFV--YDIADIIKFDTV 228 + N IS + +Y +AI P +GF+H F DIADI K Sbjct: 188 --TNETNALISFLNAMVYTACVSAIRKTALDPTVGFMHEPGDRRFTLSLDIADIYKPILA 245 Query: 229 VPKAFEIARRNPGEPD 244 F + R PD Sbjct: 246 DRVLFRLVNRRQISPD 261 >UniRef50_A4X3M4 CRISPR-associated protein, Cas1 family n=4 Tax=Actinomycetales RepID=A4X3M4_SALTO Length = 326 Score = 93.5 bits (231), Expect = 7e-18, Method: Composition-based stats. 
Identities = 52/266 (19%), Positives = 86/266 (32%), Gaps = 38/266 (14%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 S + +I D + + G IP+ + ++L ++ AAV L ++ G + Sbjct: 7 SYWLTEPCRIRREDNSIRIERADGQPVRIPITDIRDLVLFDNADINTAAVSLLSRHGVTV 66 Query: 76 VWVGEAGVRVYASGQPGG---ARSDKLLYQAKLALDEDLRLKVVRKMFEL---------- 122 + G YA + + + Q L + RL V + + Sbjct: 67 HLLDHYG--NYAGALTPADDMSSAHVVRAQVALTGNPQARLAVAQALVRATAVNVAWALG 124 Query: 123 ---------RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTW---NGRRYDPKDW 170 R A S L G+EG+ R + +L W +GR P Sbjct: 125 TDLLDGPLERLPAQIGASTSSGDLMGVEGNFRRTAWGVL-DTLLPPWLRLDGRTRRPP-- 181 Query: 171 EKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL---SFVYDIADIIKFDT 227 + N IS + Y AI PAIGF+H + D+A+ K Sbjct: 182 --SNAGNAFISYLNAITYARVLTAIRCTPLHPAIGFLHADTDRRRNTLALDLAEPFKPLL 239 Query: 228 VVPKAFEIARRN---PGEPDREVRLA 250 A + + +VR A Sbjct: 240 AERLLRRAAAQRTLTAADFVSDVRSA 265 >UniRef50_O57912 Putative uncharacterized protein PH0173 n=1 Tax=Pyrococcus horikoshii RepID=O57912_PYRHO Length = 317 Score = 91.6 bits (226), Expect = 3e-17, Method: Composition-based stats. 
Identities = 49/257 (19%), Positives = 94/257 (36%), Gaps = 31/257 (12%) Query: 16 SMIFL-QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 S I++ Q G ++ +++ ++ +P+ +++ I ++ A++L + Sbjct: 3 SPIYITQPGILERKANTLFFVNEE-MKRALPINTISEIHCFAPVTLTSGAIKLLSDNDVP 61 Query: 75 LVWVGEAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVVRKMFE---------- 121 + + + G Y ++ QA +D + RL + R++ E Sbjct: 62 VHFYNKYG--YYRGSYLPAESQISGSIVVAQASHYIDNEKRLYIAREILEGTRASMISLL 119 Query: 122 -------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG-VTWNGRRYDPKDWEKG 173 + S+E+L GIE + YA + ++ R P Sbjct: 120 KSQRAEYKDLADIDLKGESIEELMGIESQLWKTFYAHFSHLLKFFNFDERNRRPPR---- 175 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPK 231 D IN IS S LY VT + I P I ++H + S D+A+I K V Sbjct: 176 DEINAMISYGNSVLYTVTLSEIRKTYLHPGISYLHEPRERRYSLALDLAEIFKPIVVFRV 235 Query: 232 AFEIARRNPGEPDREVR 248 + + + VR Sbjct: 236 ILRLVNKRIIREEHFVR 252 >UniRef50_A1ZHZ5 Crispr-associated protein Cas1 n=1 Tax=Microscilla marina ATCC 23134 RepID=A1ZHZ5_9SPHI Length = 344 Score = 89.7 bits (221), Expect = 1e-16, Method: Composition-based stats. Identities = 36/127 (28%), Positives = 53/127 (41%), Gaps = 14/127 (11%) Query: 126 EPAPARRSVEQLRGIEGSRVRATYALL----AKQYGVTWNGRRYDPKDWEKGDTINQCIS 181 E + LRG+EG+ R + L AK+Y + GR P D N ++ Sbjct: 159 EADHTEAIADTLRGLEGTAGRLYFETLSYVLAKEY--QFAGRSKRPAH----DAFNAFLN 212 Query: 182 AATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK--FDTVVPKAFEIAR 237 LY + E A++ AG P +GF+H S VYD + + + VV K F + Sbjct: 213 YGYGILYSMVEHALVIAGIDPFVGFMHRDGYNQRSMVYDFIEPYRGHVEQVVVKLFTAKK 272 Query: 238 RNPGEPD 244 N D Sbjct: 273 VNQSHTD 279 >UniRef50_A9AYP8 CRISPR-associated protein Cas1 n=1 Tax=Herpetosiphon aurantiacus ATCC 23779 RepID=A9AYP8_HERA2 Length = 333 Score = 89.7 bits (221), Expect = 1e-16, Method: Composition-based stats. 
Identities = 58/295 (19%), Positives = 106/295 (35%), Gaps = 43/295 (14%) Query: 18 IFL-QYG-QIDVIDGAFVLIDKTGIRTHIPVGSVA-CIMLEPGTRVSHAAVRLAAQVGTL 74 ++L + G ++ D +++ + IPV V I++ G +VSHAA+ AQ G Sbjct: 4 LYLNEQGTRLGKKDERLIILRGQELINDIPVIKVDRVIVMGQGVQVSHAAIVFLAQRGIP 63 Query: 75 LVWVG-EAGVRVYASGQPGGARSDKLLYQAKLALDEDLRL---------KVVRKMFEL-R 123 L++ G + G + L Q ++ + L + KV ++ L R Sbjct: 64 LIFTTQSGGSQKAMVSAGLGNNAALRLAQCRIVDNPHLAVPLVQAIVVGKVANQIQLLER 123 Query: 124 FG------------------EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTW--NGR 163 +G + +EQLRG+EG+ A + + + W GR Sbjct: 124 YGSDWGGMGLRAKQTMQHVIQQTQHMPDIEQLRGLEGAGAAAYWGTWSAVFKTAWGFAGR 183 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIA 220 Y P D +N +S + L A+ A + P +G HT G+P S D+ Sbjct: 184 AYRPT----PDPLNALLSFGYTLLLNDLMTAVQALSFDPYLGVFHTVQFGRP-SLALDLE 238 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 + + V ++ + + + + I E + Sbjct: 239 EEFRPCIVDRMVLDVLDAGLLQM-SNFSRTEKGFLLNDRARKSFIQAYEQRMQTP 292 >UniRef50_B8CYA1 CRISPR-associated protein Cas1 n=2 Tax=cellular organisms RepID=B8CYA1_HALOH Length = 325 Score = 88.9 bits (219), Expect = 2e-16, Method: Composition-based stats. 
Identities = 56/266 (21%), Positives = 97/266 (36%), Gaps = 52/266 (19%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + V +F + + + + V+ I++ G +S AV+LA + + ++ E G Sbjct: 12 LHVKQKSFEIKTEEDKK-RVSAKKVSSILITTGAAISTDAVKLALENNIEIQFLDEFGCS 70 Query: 85 VYASGQPGGARSDKLLY-QAKLALDED-------------------------LRLKVVRK 118 + P + + Q +LA E+ R K K Sbjct: 71 LGKVWHPKLGSTTYIRRKQLELAESEEGTELVKEFMLDKIDNMINHLHDLAIKRSKSKEK 130 Query: 119 MFELRFGEPAPARRSVE-----------QLRGIEGSRVRATYALLA----KQYGVTWNGR 163 + E R +E + G EG+ R +A L+ +Y +NGR Sbjct: 131 YINKKIKEICELRNKLEKVTGYIEDVRNTIMGYEGNISRKYFASLSFLLPDRY--KFNGR 188 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIAD 221 + P + D N ++ LYG E A++ AG P +G +HT SFV+D + Sbjct: 189 SFRPAE----DEFNCLLNYGYGVLYGKVEKALIIAGLDPYVGILHTDGYNKKSFVFDFIE 244 Query: 222 IIKFDT--VVPKAFEIARRNPGEPDR 245 + VV K F + D+ Sbjct: 245 PYRHHIDRVVMKLFSRKKIRKLHFDK 270 >UniRef50_Q2FL78 CRISPR-associated protein, Cas1 family n=1 Tax=Methanospirillum hungatei JF-1 RepID=Q2FL78_METHJ Length = 331 Score = 88.9 bits (219), Expect = 2e-16, Method: Composition-based stats. 
Identities = 53/261 (20%), Positives = 93/261 (35%), Gaps = 42/261 (16%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVA-----CIMLEPGTRVSHAAVRLAAQVGTLLVWV 78 +I + T P +++ + + +S AAVRL G + + Sbjct: 12 RIRKRGDVLTIETGKDSDTAEPPRTLSPLGLDLLAIAGDHSISTAAVRLVTSHGGAIALM 71 Query: 79 GEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM------------------- 119 G + P G + Y+A+ + E+ RL++ R + Sbjct: 72 DGLGNP-FGHFLPLGRSALIEQYEAQASAPEERRLEIARSICTGALENKRTLLSNLERIR 130 Query: 120 -----FELRFGEPAPARR----SVEQLRGIEGSRVRATYALLAKQYGVTWN--GRRYDPK 168 E+R E A + S++ LRG+EGS A + + + W GR +P Sbjct: 131 GFDLSREIRLVEDAQDKALECQSLDSLRGVEGSGAHAYFQGFSLAFDEEWGFLGRSQNPA 190 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFD 226 D +N +S LY A++ +GY+P G H K + VYD+ + + Sbjct: 191 ----TDPVNSLLSYGYGMLYIQARQALVLSGYSPYYGAYHETYKKQEALVYDLVEEFRQP 246 Query: 227 TVVPKAFEIARRNPGEPDREV 247 V ++ PD Sbjct: 247 VVDRTVVTFLAKHMATPDDFT 267 >UniRef50_D2NTT1 Uncharacterized protein predicted to be involved in DNA repair n=2 Tax=Rothia mucilaginosa RepID=D2NTT1_9MICC Length = 560 Score = 88.1 bits (217), Expect = 3e-16, Method: Composition-based stats. 
Identities = 44/248 (17%), Positives = 75/248 (30%), Gaps = 39/248 (15%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGV 83 + V G ++ + +P+ V +++ VS A +R ++W G Sbjct: 243 RATVQQGRLIVQHLGETISSVPLERVHSLVVHGNIDVSSALLRELMWRNCTIIWCSSTG- 301 Query: 84 RVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP---------------- 127 RVY QPG + Q + L + M + Sbjct: 302 RVYGWSQPGTGPNGLARVQQHVLS-AQGYLPIASAMIASKIANQATLLRRNGHAADVCRT 360 Query: 128 -------APARRSVEQLRGIEGSRVRATYALLAK--------QYGVTWNGRRYDPKDWEK 172 P S+ +L G+EG + A + G W GR+ Sbjct: 361 MRDIQKNTPQATSIPELLGLEGEAASLYFGNFATMLKEDALTELGWIWTGRQGR----GA 416 Query: 173 GDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVP 230 D IN ++ A L IL G P GF+H+ + D+ + + Sbjct: 417 NDPINILLNYAYGMLSSECIRGILTCGLDPHAGFLHSSSRNKPALALDLMEEFRAVIADS 476 Query: 231 KAFEIARR 238 + R Sbjct: 477 VVVSLINR 484 >UniRef50_B0VIK5 Putative uncharacterized protein n=1 Tax=Candidatus Cloacamonas acidaminovorans RepID=B0VIK5_9BACT Length = 353 Score = 87.4 bits (215), Expect = 6e-16, Method: Composition-based stats. 
Identities = 43/262 (16%), Positives = 83/262 (31%), Gaps = 45/262 (17%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 YG + +T + ++ I++ ++ A++LA +V++ + Sbjct: 9 YGSYLRKKDEMFELSIEDRKTKLSPEKISSIVISNAATITTDAIQLAMDYNIDIVFLDKY 68 Query: 82 GVRVYASGQPGGARSDKLLY-------------------------QAKLALDEDLRLKVV 116 G P + + Q + + Sbjct: 69 GSPYGRIWFPKIGSTVLIRRRQLEMLSDNVGLQFIKDWIAIKIMNQYRFVQRLISKRDCD 128 Query: 117 RKMFELRFGEPAPARRSVEQ-----------LRGIEGSRVRATYALLAKQY--GVTWNGR 163 + +F+ R A + Q G EG + +++LA+ + GR Sbjct: 129 KSIFQTRMQNMQEAAICIMQAEGKLEELSGSFMGWEGGASKNYFSILAELIPDAYKFEGR 188 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIAD 221 P D N ++ LY E A++ AG P +G +H+ SFV+D + Sbjct: 189 SSRPAK----DAFNAMLNYGYGILYSKVERALIIAGLDPYLGLLHSDNYNKKSFVFDFIE 244 Query: 222 IIKFDTVVPKAFEIARRNPGEP 243 + P F I R+ P Sbjct: 245 PYRILVDEPV-FYIFSRHKFSP 265 >UniRef50_Q1CWU5 CRISPR-associated protein Cas1 n=1 Tax=Myxococcus xanthus DK 1622 RepID=Q1CWU5_MYXXD Length = 342 Score = 86.2 bits (212), Expect = 1e-15, Method: Composition-based stats. 
Identities = 47/275 (17%), Positives = 89/275 (32%), Gaps = 38/275 (13%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 I + +++ V+ + + +P+ + ++ ++ + + G + Sbjct: 9 FITAEGTRLNKEGECVVVTVQDQKKAEVPLRHLRSVVCLTRAWLTPELMESCLEAGIHVS 68 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLR------------LKVVRKMF---- 120 + G G R A + + L Q A D+ R L R+ Sbjct: 69 FFGMTG-RFLARVEGVPGGNVLLRRQQYRAADDIARSVAISRALVVGKLGNARQFVLHAR 127 Query: 121 --------------ELRFGEPAPARRSVEQL---RGIEGSRVRATYALLAKQYGVTWNGR 163 R E A VE L RG+EG R + + G Sbjct: 128 RDAAPERQEALSETARRLSEHLRALTRVEDLVQVRGLEGIAARDYFESFPALLKKSAQGF 187 Query: 164 RYDPKDWEKG-DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKP--LSFVYDIA 220 +D ++ + +N +S + L A+ G PA+GF+H +P LS D+ Sbjct: 188 EFDGRNRRPPRNPLNAMLSFGYALLAQDCAGALTGVGLDPAVGFLHEDRPGRLSLALDLM 247 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIF 255 + + V F + R +P + R Sbjct: 248 EEFRAPVVDRLVFSLVNRGQLKPG-DFRTESAGAV 281 >UniRef50_Q96X75 Putative uncharacterized protein ST2634 n=1 Tax=Sulfolobus tokodaii RepID=Q96X75_SULTO Length = 317 Score = 86.2 bits (212), Expect = 1e-15, Method: Composition-based stats. 
Identities = 53/254 (20%), Positives = 96/254 (37%), Gaps = 29/254 (11%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVA-CIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 ++ FV+ K G + ++ V +++ G V+ A+RLA G ++++ G Sbjct: 25 KLSTKGKTFVISKKDGKKVNVSPAEVDQIVIMTSGVTVTSKAIRLALDHGIDIIFLDSRG 84 Query: 83 V---RVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR----------KMFELRFGEPAP 129 R++ S + K Y A L +E++ ++++ K + + G Sbjct: 85 NPFGRLFHSEPIKTVETRKAQYLAILKGEEEIPREIIKSKIKNQANHIKFWFKKLGIEGN 144 Query: 130 ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 + +E E + R + L + + GR D E D N + A + LY Sbjct: 145 DYKLIEGKDDDEATAARYYWHALGRI--IPMKGR-----DPESTDPFNVSFNYAYAILYS 197 Query: 190 VTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVV-PKA--FE---IARRNPG 241 + + G P GF+H + S VYD +++ K V P F I G Sbjct: 198 NIQRVLQLVGLDPYAGFIHKDRSGKPSLVYDFSEMFKPVLVDYPLVSLFINGFIPNVKDG 257 Query: 242 EPDREVRLACRDIF 255 D E R + Sbjct: 258 ILDAESRKKIAEAV 271 >UniRef50_A3XI90 Putative uncharacterized protein n=1 Tax=Leeuwenhoekiella blandensis MED217 RepID=A3XI90_9FLAO Length = 332 Score = 85.8 bits (211), Expect = 2e-15, Method: Composition-based stats. 
Identities = 55/247 (22%), Positives = 82/247 (33%), Gaps = 51/247 (20%) Query: 43 HIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRV--YASGQPGGAR----- 95 HI V ++ G ++ A+ LA + +V V G + + + G Sbjct: 35 HIAAHKVTSFIVSKGAALTTDAIALALKHNIDIVLVENNGHPMGRFWHSKLGSTTKIRKQ 94 Query: 96 ------------------SDKLLYQAKLALDEDLRLK----------VVRKMFELRFGEP 127 S KL QA D K F + E Sbjct: 95 QLVASLNQTGVYWIKEWLSQKLENQADYLNDLKKHRKNLHVYLDEKSAAILGFRKKIKEA 154 Query: 128 APARRSV--EQLRGIEGSRVRATYALLA----KQYGVTWNGRRYDPKDWEKGDTINQCIS 181 A + E RG EGS R + LA Y + GR + P D N ++ Sbjct: 155 DGADINQLAESFRGWEGSAGRHYFEALATCIPDAYS--FKGRSFRPAQ----DEFNALLN 208 Query: 182 AATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK--FDTVVPKAFEIAR 237 A LY E A++ AG P +GF+H S VYD + + + V K+F + Sbjct: 209 YAYGILYSRVERALMLAGLDPFVGFMHRDDYNSKSLVYDFIEPYRIYAERFVFKSFSSKK 268 Query: 238 RNPGEPD 244 N + Sbjct: 269 MNKSYFE 275 >UniRef50_C7QUZ4 CRISPR-associated protein Cas1 n=8 Tax=Cyanobacteria RepID=C7QUZ4_CYAP0 Length = 325 Score = 85.4 bits (210), Expect = 2e-15, Method: Composition-based stats. 
Identities = 47/259 (18%), Positives = 95/259 (36%), Gaps = 34/259 (13%) Query: 15 VSMIFLQY--GQIDVIDGAFVL----IDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +S+++L + AF + D + + IP +V I+L ++ A+ A Sbjct: 1 MSILYLTQPDAVLSKKQEAFHVALKQEDGSWKKQLIPAQTVEQIVLIGYPSITGEALCYA 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR----------- 117 ++G + ++ G + ++ L Q + +E+ RL +V+ Sbjct: 61 LELGIPVHYLSCFGKYLGSALPGYSRNGQLRLAQYHVHDNEEQRLALVKTVVTGKIHNQY 120 Query: 118 ----KMFELRFG-----EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWN--GRRYD 166 + + + ++ ++EQ+RG+EG + + WN GR Sbjct: 121 HVLYRYQQKDNPLKEHKQLVKSKTTLEQVRGVEGLAAKDYFNGFKLILDSQWNFNGRNRR 180 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIK 224 P D +N +S A L AA+ AG P IG++H + V D+ + + Sbjct: 181 PP----TDPVNALLSFAYGLLRVQVTAAVHIAGLDPYIGYLHETTRGQPAMVLDLMEEFR 236 Query: 225 FDTVVPKAFEIARRNPGEP 243 + +P Sbjct: 237 PLIADSLVLSVISHKEIKP 255 >UniRef50_C1ZJF3 CRISPR-associated protein, Cas1 family; CRISPR-associated exonuclease, Cas4 family n=1 Tax=Planctomyces limnophilus DSM 3776 RepID=C1ZJF3_PLALI Length = 598 Score = 85.1 bits (209), Expect = 3e-15, Method: Composition-based stats. 
Identities = 54/288 (18%), Positives = 92/288 (31%), Gaps = 39/288 (13%) Query: 6 LNPIPLKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 +P +D I++Q I L IP+ V+ + L +V+ A Sbjct: 261 RKLVPARDDALPIYVQDQGTYIGKDGERLKLTPAKSSPLFIPLIQVSQVCLMGNVQVTAA 320 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE-- 121 A+R A + + G + + + Q+K A D L + R Sbjct: 321 AIRELADRNIPISYFSYGGWFTALTSGMCHKNVELRMAQSKAAFDPQAALSIARGFISAK 380 Query: 122 -------LRFGEPAPARR----------------SVEQLRGIEGSRVRATYALLAK--QY 156 LR R ++ L G+EG + +A ++ + Sbjct: 381 IKNSRTLLRRHADDKHRSDLDRLADYIQKVEQVDNLNSLMGLEGMAAKTYFAGFSRLLRG 440 Query: 157 GVTWN--GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GK 211 G +N GR P D +N +S S L A G+ P +GF+H G+ Sbjct: 441 GDEFNLEGRNRRPP----TDPVNALLSFVYSLLTKELTITTQAVGFDPFLGFLHQPRYGR 496 Query: 212 PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSK 259 P S D+A+ + + P +R A + Sbjct: 497 P-SLALDLAEEFRPLVGDSTVLTLINNEEVSPKSFIRRAGSVALTETG 543 >UniRef50_C7NA04 CRISPR-associated protein Cas1 n=1 Tax=Leptotrichia buccalis C-1013-b RepID=C7NA04_LEPBD Length = 323 Score = 83.9 bits (206), Expect = 6e-15, Method: Composition-based stats. 
Identities = 39/264 (14%), Positives = 85/264 (32%), Gaps = 49/264 (18%) Query: 18 IFL-QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 +++ G + I + + + + ++L G ++ ++LA +V Sbjct: 3 LYITDLGTVVKKRDDLFEITTSEKKVAVAPQKIKSLVLSKGIFLTTDVIKLAVDNNIDIV 62 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM----------------- 119 V + G Q + + + +++ +++ Sbjct: 63 IVDDFGNPYGRFWQSKFGSTANIRRKQLEIFGTQKGIELAKQILIQKIKNCAEHLEDLKI 122 Query: 120 --------------------FELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--G 157 ++++ E + + L G EG+ + Y L++ G Sbjct: 123 KREAKKAFLDKQIKEMKRYIYQIKLVEGNVSEK-RGTLMGYEGNAAKIYYQTLSELIPEG 181 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSF 215 + R P + D N ++ A LY E A + AG P +G +HT S Sbjct: 182 FKFEKRSMHPAE----DEFNAMLNYAFGILYSKVEKACIIAGLDPYVGIIHTDNYGKKSL 237 Query: 216 VYDIADIIK--FDTVVPKAFEIAR 237 V+D+ + + V F R Sbjct: 238 VFDLIESYRHLASRTVFSLFTQKR 261 >UniRef50_UPI0001C16754 protein of unknown function DUF48 n=1 Tax=Cylindrospermopsis raciborskii CS-505 RepID=UPI0001C16754 Length = 334 Score = 83.5 bits (205), Expect = 8e-15, Method: Composition-based stats. 
Identities = 47/280 (16%), Positives = 96/280 (34%), Gaps = 37/280 (13%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + ID + T +P+ + +++ ++ A+++ Q + ++ E G R Sbjct: 13 VRQIDERLEIFKHDKRLTDVPLCKLESVVVYGRVILTIPALKILNQRSIPITYLSEEG-R 71 Query: 85 VYASGQPGGARSDKLL-YQAKLALDEDLRLKVVRKMFE----------LRFGEPAPA--- 130 A+ P + L Q + + D L + R++ RF Sbjct: 72 TVATLLPEPNPNAILRSKQYQASFDTHKTLAIAREIIRGKLFNQHTILARFSRQNERSEK 131 Query: 131 -----------------RRSVEQLRGIEGSRVRATYALLAKQ-YGVTWNGRRYDPKDWEK 172 S+ +LRG EG + + +L + G W+ + Sbjct: 132 VISALKSLKACQKSIDQTTSLNELRGYEGQGASSYFGVLGELLTGTPWSFSHRTRRPP-- 189 Query: 173 GDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKFDTVVP 230 D +N + + LYG AA+ G P GF+H + + D+ + + V Sbjct: 190 TDPVNALLGFGYALLYGDCRAALHTVGLDPYQGFLHGERYGRANLALDLMEEFRPIFVDG 249 Query: 231 KAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 ++ R N E + + + + + + K I E Sbjct: 250 LVLQLLRNNSLEKESFINYPGGAVHLNEQGMKKFIQSYEQ 289 >UniRef50_C0QHV1 Putative CRISPR-associated protein (Uncharacterized protein, predicted to be involved in DNA repair) n=1 Tax=Desulfobacterium autotrophicum HRM2 RepID=C0QHV1_DESAH Length = 338 Score = 83.5 bits (205), Expect = 8e-15, Method: Composition-based stats. Identities = 28/95 (29%), Positives = 43/95 (45%), Gaps = 8/95 (8%) Query: 134 VEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVT 191 + G+EG+ RA + +LA+ + GR P D N ++ LYG Sbjct: 157 RNTIMGLEGTSARAYFKVLARAMPEKYRFKGRSRRPAK----DPFNAVLNYCYGMLYGKV 212 Query: 192 EAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK 224 E A + AG P IGF+HT S V+D+ + + Sbjct: 213 EKACIIAGLDPFIGFLHTDNYNKKSLVFDLIEPFR 247 >UniRef50_B7KMR5 CRISPR-associated protein Cas1 n=3 Tax=Chroococcales RepID=B7KMR5_CYAP7 Length = 558 Score = 82.4 bits (202), Expect = 2e-14, Method: Composition-based stats. 
Identities = 62/310 (20%), Positives = 107/310 (34%), Gaps = 48/310 (15%) Query: 3 WLPLNPIPLKDRVSMIF-LQYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV 60 W P+ P+ D +I L+ G ++ + I +G V+ ++L +++ Sbjct: 210 WHPVRLFPVDDEREVIHVLEPGTRVGRTGEQLKISRPNQPDEKIAIGQVSQVVLHSFSQI 269 Query: 61 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF 120 S AV A + +V G R S + + Q + D L++ RK+ Sbjct: 270 STQAVHFLAYKEVGIHFVSGGG-RYIGSIDARSRSIQRRVRQYQALSQPDFCLELARKLV 328 Query: 121 ELRFGE--------------------------------PAPARRSVEQLRGIEGSRVRAT 148 R GE P +S++ L GIEG+ Sbjct: 329 ACR-GEGQRKFLMRGKRNKKGDSLALEKTIAQMKAVLKQVPQIQSLDSLLGIEGNLAALY 387 Query: 149 YALLAKQY------GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAP 202 + L+ + ++GR P D N +S S L AILA G P Sbjct: 388 FGALSNLLAENAPESLLFSGRNRRPPK----DRFNALLSFGYSLLIKDVMNAILAVGLEP 443 Query: 203 AIGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKT 260 A+GF H + + D+ +I + V R+ + + + + ++ S Sbjct: 444 ALGFYHQPRTQAPPLALDLMEIFRVPLVDMPVVTSINRSQWDIQADFDVRGQQVWLSDSG 503 Query: 261 LAKLIPLIED 270 K I L E Sbjct: 504 RRKFINLYEQ 513 >UniRef50_D0MKV7 CRISPR-associated protein Cas1 n=1 Tax=Rhodothermus marinus DSM 4252 RepID=D0MKV7_RHOM4 Length = 553 Score = 82.0 bits (201), Expect = 2e-14, Method: Composition-based stats. 
Identities = 42/301 (13%), Positives = 101/301 (33%), Gaps = 36/301 (11%) Query: 6 LNPIPLK-DRVSMIFLQYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 +P + DR+ + L G + ++ + H + ++ + + +++ Sbjct: 216 RRLVPSRNDRLPLYVLSQGSVVKRKGAQLLVQTQDDKDQHFRLIDLSRVSIFGNVQITTQ 275 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE-- 121 A+R + + + AG + + + Q ++A DE L + R + Sbjct: 276 AIRALVEHNIPVFFHSYAGRMLARLVSMYDVNAPVRVAQFEVAADETKSLAIARAIVTGK 335 Query: 122 -----------------------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAK--QY 156 R A S+++L GIEG+ R + ++ ++ Sbjct: 336 IKNQRTLLRRNQRTRSERVLRELSRLAREARRASSLDELLGIEGAAARLYFRQFSRMLRH 395 Query: 157 GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPL 213 + ++ + + + + D +N +S + L A++A G P G H G+P Sbjct: 396 RIAFDFKNRNRRPPK--DPVNAMLSFLYALLLKDAMCALMATGLDPYRGIFHQMRFGRP- 452 Query: 214 SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 S D+ + + + + + + + K+I E + Sbjct: 453 SLALDLMEEFRPLIADSVVLRLVNTGAV-TEADFIVRGPACAMKKSAMEKVIEAYEQRMN 511 Query: 274 A 274 Sbjct: 512 T 512 >UniRef50_A2BKJ8 Universally conserved protein n=1 Tax=Hyperthermus butylicus DSM 5456 RepID=A2BKJ8_HYPBU Length = 331 Score = 80.4 bits (197), Expect = 7e-14, Method: Composition-based stats. 
Identities = 58/279 (20%), Positives = 92/279 (32%), Gaps = 51/279 (18%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGT-RVSHAAVRLAAQVGTLLVWVGEAG 82 +I V G V + + ++L G+ VS A+R A++G LV +G+ G Sbjct: 13 RIYVRRG-VVYAEAPSGEKAVVTADTELVVLATGSVSVSGRALRRLAELGVRLVVLGQRG 71 Query: 83 ----------------------VRVYASGQPGGARSDKLLY----QAKL--ALDEDLR-- 112 RV A+G+ ++ + QAKL L + R Sbjct: 72 QVVAEHRPVDRVNRTIEARMEQYRVKATGEALYYAAEMVYAKIVNQAKLLRYLAKSRREP 131 Query: 113 ------LKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYD 166 +V LR + E +R IE R + +A+ + GR Sbjct: 132 WLRDAGYRVEGHADRLRQIIENEEPTTPEVIRSIEAQAARDYWDAIAQIAPTPFPGR--- 188 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADII- 223 D +N +S + LY + A+ AG P GF+H + S YD AD Sbjct: 189 --QPRGEDHLNMALSYGYAILYSIAHDALTVAGLDPYAGFLHADRSGRPSLTYDYADTYK 246 Query: 224 -----KFDTVVPKAFEIARRNPGEPDREVRLACRDIFRS 257 K P+ + G R + Sbjct: 247 PIAVDKPLLTAPRKTDCLDTYMGALTYNARRCIATLVLE 285 >UniRef50_A6UVX9 CRISPR-associated protein Cas1 n=2 Tax=Methanococcales RepID=A6UVX9_META3 Length = 334 Score = 79.7 bits (195), Expect = 1e-13, Method: Composition-based stats. Identities = 22/112 (19%), Positives = 42/112 (37%), Gaps = 8/112 (7%) Query: 120 FELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTIN 177 + + P R + + GIEG + + + + R P D N Sbjct: 150 YTNNIDDKLPHREIKDTIIGIEGIASKYYFEGINHALPKNYKFKERSRRPAR----DKFN 205 Query: 178 QCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDT 227 ++ LY + E ++ AG P +GF+H + VYD+ ++ + Sbjct: 206 ALLNYGYGMLYPMVEKCLIVAGLDPYVGFIHADNYNKTTLVYDVIEMYRAHI 257 >UniRef50_C1XN81 CRISPR-associated protein Cas1 n=2 Tax=Meiothermus RepID=C1XN81_MEIRU Length = 323 Score = 79.3 bits (194), Expect = 2e-13, Method: Composition-based stats. 
Identities = 55/291 (18%), Positives = 102/291 (35%), Gaps = 32/291 (10%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + + G + +P V +++ R++ A+ + G +++ G + Sbjct: 12 LRLSQGRLRVELDEQTLAELPARKVRGVVVWGNVRLTTPALAFLLRQGVPVLYATLEG-Q 70 Query: 85 VYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG------------------- 125 +Y Q + + ++L LA L L + +LR G Sbjct: 71 LYGQAQAPQSLAPEVLRAQLLAQQNPLPLAQGFLLGKLRSGQMLLERLARQAPITPQQAE 130 Query: 126 -----EPAPARRSVEQLRGIEGSRVRATYALLAKQYG-VTWNGRRYDPKDWEKGDTINQC 179 E P RS+E LRGIEG+ RA +A L ++GR P D +N Sbjct: 131 IEAALEALPQARSLEALRGIEGNAARAYFAGLQAVLAPYGFSGRNRRPP----TDAVNAA 186 Query: 180 ISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIAR 237 +S L G A+ AG P +G +HT + +D+ + + V Sbjct: 187 LSYGYMVLLGRVLLALGIAGLHPELGLLHTEGRRVPALAFDLMEEFRVSVVDAVVIAAFL 246 Query: 238 RNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQP 288 R+ P + ++ + L+ +E+ + P + Q Sbjct: 247 RSELTPQQHSEARNGGVYLNEAGRKALLRRLEERFSQEAQHPKGFRKPYQE 297 >UniRef50_A8UXX8 Putative uncharacterized protein n=1 Tax=Hydrogenivirga sp. 128-5-R1-1 RepID=A8UXX8_9AQUI Length = 290 Score = 78.9 bits (193), Expect = 2e-13, Method: Composition-based stats. 
Identities = 45/272 (16%), Positives = 104/272 (38%), Gaps = 26/272 (9%) Query: 13 DRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 R+ +++ ++ + + I + +PV V I+ G +S A+ L Q Sbjct: 2 KRIVVVY-SSARVSRSGERVKISTFS-INSSLPVRYVEAIVAFGGLELSSHALSLLMQNN 59 Query: 73 TLLVWVGEAG-VRVYASGQPGGARSDKLLYQAKLALDE---------DLRLKVVRKMFEL 122 + ++ + G ++ + + + + Q + + + +++ + + F L Sbjct: 60 VPVFFLTKLGALKAVLWTKILSSNTSNRIRQYEKYVRDPFGVAKEIVRAKIRTIEREFGL 119 Query: 123 RFGEPA---PARRSVEQLRGIEGSRVRATYALLAKQ---YGVTWNGRRYDPKDWEKGDTI 176 + + E+L GIEG+ R + ++ G ++ R Y P D + Sbjct: 120 KLNNLISSLERAGTKEELLGIEGTASRLMFERFSQNIELSGFSFRERAYHPP----PDPV 175 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPKAFE 234 N +S + + Y ++ GY P I F+HT G L+ DI + ++ Sbjct: 176 NALLSLSYTFTYALSLPLTTLMGYDPYISFLHTRSGSHLALCSDIMEPVRPVLTKRLEEP 235 Query: 235 IARRNPGEPDREVRLACRDIFRSSKTLAKLIP 266 I RR + ++ + +++ K + Sbjct: 236 ILRRVFTK--KDFNRERAACYLKKESMPKFLN 265 >UniRef50_UPI00016C522C CRISPR-associated protein Cas1/Cas4 n=1 Tax=Gemmata obscuriglobus UQM 2246 RepID=UPI00016C522C Length = 571 Score = 78.1 bits (191), Expect = 4e-13, Method: Composition-based stats. 
Identities = 45/276 (16%), Positives = 85/276 (30%), Gaps = 38/276 (13%) Query: 7 NPIPLKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 IP+ D ++++LQ + V+ + T +P+ ++ +++ +VS A Sbjct: 226 KVIPMTDDGAVLYLQEPGTSVGKRSEHLVVKKEGQELTRVPMHAIRQVVVCGNVQVSTQA 285 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRK------ 118 + A + +V G + + Q + D L + + Sbjct: 286 LETLAANDIAVAYVTGHGRFIGSFVPAPAKNVSLREAQFRTFNDPSACLDLAKAVVRAKL 345 Query: 119 -------MFELR------------------FGEPAPARRSVEQLRGIEGSRVRATYALLA 153 M LR A+ SVE + GIEG + Sbjct: 346 SNQRALLMRSLRGEGEARGSHEYSAKGIYGLLGALDAQTSVESVLGIEGQGAALYFGDFG 405 Query: 154 KQYGVTWNGRRYD---PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG 210 + G+ +D D +N +S A + L + G+ P GF H G Sbjct: 406 RFLKQPPTGKGFDFTTRNRRPPRDPVNALLSFAYAMLAKDCFSVACTVGFDPYKGFFHVG 465 Query: 211 K--PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 + S D+ + + + PD Sbjct: 466 RHGKPSLALDLMEEFRPVIADSVVLTLINNEALTPD 501 >UniRef50_B9K7F7 CRISPR-associated protein, Cas1 family n=1 Tax=Thermotoga neapolitana DSM 4359 RepID=B9K7F7_THENN Length = 329 Score = 78.1 bits (191), Expect = 4e-13, Method: Composition-based stats. 
Identities = 51/286 (17%), Positives = 95/286 (33%), Gaps = 35/286 (12%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGI---RTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + G++ + ++ + R IPV ++ I ++ + AA+ Sbjct: 7 NYYVFSSGRVKRHENTILIEYQKAGMQQRKFIPVENIDQIFFLGEVDLNSKFLDFAAKNN 66 Query: 73 TLLVWVGEAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVVRKMFE---LRFGE 126 +L + G Y + L+ Q + LD + RL + RK E F Sbjct: 67 IVLHFFNYYG--YYTGSFYPREKFISGELLVRQVEHYLDSEKRLTLARKFVEGAVHNFKR 124 Query: 127 PAPAR------------------RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPK 168 R +++ +L E + Y+ + G + R P Sbjct: 125 NIEKRGFDITDKISEYLERTKYAKTIPELMSCEAHARKLYYSTWEEITGWPFEERSMQPP 184 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFD 226 E +N IS S Y + + P I ++H K S DIA+I K Sbjct: 185 LNE----LNALISFGNSLTYSIVLKELYFTHLNPTISYLHEPGTKRFSLALDIAEIFKPI 240 Query: 227 TVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 V F++ + ++ R +F + + I E++L Sbjct: 241 FVDRIIFKLINLKKIDREKHFLQEARGVFLNEEGRRLFIEEFENML 286 >UniRef50_A9GDF7 Putative uncharacterized protein n=1 Tax=Sorangium cellulosum 'So ce 56' RepID=A9GDF7_SORC5 Length = 365 Score = 77.0 bits (188), Expect = 8e-13, Method: Composition-based stats. 
Identities = 61/280 (21%), Positives = 101/280 (36%), Gaps = 35/280 (12%) Query: 10 PLKDRVSMIFLQYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 P +R S+ + +G +I AF + ++ G RT I V+ I++ +++ A+RL Sbjct: 25 PDTERRSLHVVSHGARIGRASDAFEVTEREGERTRIGAREVSDIVVHGHAQITTQALRLC 84 Query: 69 AQVGTLLVWVGEAG--VRVYASGQPGGARS----------------DKLLYQAKLALDED 110 A + +VG G V V++ G G R + L AK+ + Sbjct: 85 AAEEIAVHFVGAGGAHVGVFSGGSSGVQRRIRQFRGLTDDVFAIGLARRLVMAKIEMQLR 144 Query: 111 LRLKVVRKMFELRFGEPAP------------ARRSVEQLRGIEGSRVRATYALLAKQYGV 158 L+ RK LR G P L G EG+ R + LA Sbjct: 145 HVLRASRKDEALRAGLDEPIESLRGGLRRAAKAADRTSLMGQEGNAARGYFGALAALVHP 204 Query: 159 TW--NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSF- 215 R D N +S + LY +AIL G P G +H + +F Sbjct: 205 DAGDALRPRGRSRRPPEDRFNALLSFGYTLLYRDVLSAILRVGLEPGFGVLHQPRSAAFP 264 Query: 216 -VYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDI 254 D+ ++ + V R + +R+ + R + Sbjct: 265 LALDLTELFRVPVVDMAVLGAVNRRTFDAERDFVITGRQV 304 >UniRef50_C1XWQ6 CRISPR-associated protein, Cas1 family n=3 Tax=Thermaceae RepID=C1XWQ6_9DEIN Length = 326 Score = 77.0 bits (188), Expect = 8e-13, Method: Composition-based stats. 
Identities = 48/282 (17%), Positives = 94/282 (33%), Gaps = 47/282 (16%) Query: 15 VSMIF-LQYGQIDVIDGAFVLIDKT--GIRTHIPVGSVAC---IMLEPGTRVSHAAVRLA 68 + +++ L+ G + G + + G T + + ++L V+ A ++ Sbjct: 1 MGVVYVLEDGYLAKDGGTLKVSKRGPGGGETLLEKPLIGVEEIVVLGNAV-VTPALLKHC 59 Query: 69 AQVGTLLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKM-------- 119 A+ L +V G R +A + + + Q LD +L + R+ Sbjct: 60 AEENIGLHYVSTGG-RYFAGLTRTPAKNAPARVAQFAAHLDPTRKLALARRFVLGKLRNS 118 Query: 120 ------------FELRFG-EPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRR 164 E+R+ LRG+EG+ + A G ++ R Sbjct: 119 LTLLRRNGAEGWEEIRWAIGELDKAADEGALRGLEGNAADVYFRSYAALLPEGFRFSERS 178 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADI 222 P D N +S A + L E+A+ AG P +G++H + S D+ + Sbjct: 179 RRPPR----DPANSLLSLAYTFLAKECESALQVAGLDPYVGYLHEVRYGRASLALDLMEE 234 Query: 223 IKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKL 264 + + R + F ++ KL Sbjct: 235 FRSILADSVVLSLLNN---------RRLTLEDFDDAEGYPKL 267 >UniRef50_A1A2M8 CRISPR-associated DNA polymerase n=14 Tax=Bacteria RepID=A1A2M8_BIFAA Length = 343 Score = 76.6 bits (187), Expect = 9e-13, Method: Composition-based stats. 
Identities = 46/269 (17%), Positives = 86/269 (31%), Gaps = 46/269 (17%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 + + + + + V+ +P+ S+ IM S A + ++G + Sbjct: 9 FVMTEDAYLALENDNVVIYQNDQTLAKVPLRSIEGIMCFSYKGASPALMGRCGKLGVSMA 68 Query: 77 WVGEAGVRVYASGQPGGARSDKLLY-QAKLALDEDLRLKVVRKM---------------- 119 + G Y S R+ L Q ++A DE L+ + Sbjct: 69 FYSPRG-HYYCSVLGEENRNVLLRREQFRVADDEQKSLRYAKSFIVGKLYNAKWVLERTK 127 Query: 120 --FELRFGEPAPA---------------RRSVEQLRGIEGSRVRATY-----ALLAKQYG 157 LR A ++++LRG+EG + + +L + Sbjct: 128 RDHALRVNIDRLAEQSGKLSAALSKARKSLTIDELRGVEGLAAKDYFYAFDDLVLKNKDD 187 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SF 215 + R P D +N +S S L AA+ G P +GF+HT +P S Sbjct: 188 FFFTSRSRRPP----LDRLNALLSFCYSILTNDCIAALQGVGLDPYVGFMHTDRPGRASL 243 Query: 216 VYDIADIIKFDTVVPKAFEIARRNPGEPD 244 D+ + + + +P Sbjct: 244 ALDLVEEFRPVLADRFVLTLVNTGAVKPG 272 >UniRef50_D0YU98 Crispr-associated protein Cas1 n=1 Tax=Mobiluncus mulieris 28-1 RepID=D0YU98_9ACTO Length = 309 Score = 76.6 bits (187), Expect = 1e-12, Method: Composition-based stats. 
Identities = 45/228 (19%), Positives = 80/228 (35%), Gaps = 30/228 (13%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 G + +GA V+ + G +P+ VA ++ TR S +V + T +++ Sbjct: 18 RGFVTSTEGALVVRPEEGEERRVPISDVAVVLFGVDTRFSAGSVHRILKNDTAVIFCDWK 77 Query: 82 GVRVYASGQPGGARSDKLLYQ-AKLALDEDLRLKV-VRKMFELRFGEP-------APARR 132 GV Y P G + Q A+ R V R + G+ P + Sbjct: 78 GVP-YGHAYPWGDHTRVGARQIAQANASIPARKSVWARLIKSKVLGQAEVLEFFGRPNGK 136 Query: 133 SVEQL---------RGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDT--INQCIS 181 +++L IEG R + L W+ + + GD IN + Sbjct: 137 RLKELVKDIRSGDPSNIEGQAARIYWESL-------WDDQDFRRTPGAGGDIFTINAMLD 189 Query: 182 AATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDT 227 + L G A+ A+G ++G H G+ + D + + Sbjct: 190 YGYTILRGHAMRAVAASGLISSLGVAHRGRSNPWNLADDFIEPFRPAI 237 >UniRef50_C3MX12 CRISPR-associated protein Cas1 n=11 Tax=Sulfolobaceae RepID=C3MX12_SULIM Length = 304 Score = 76.2 bits (186), Expect = 1e-12, Method: Composition-based stats. Identities = 47/298 (15%), Positives = 102/298 (34%), Gaps = 50/298 (16%) Query: 6 LNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAV 65 + + + + + I+++ + + G + I V I++ +S +A+ Sbjct: 1 MRTLVISEYGAYIYVKKNMLVIKKG--------DNKVEISPSEVDEILITASCSISTSAL 52 Query: 66 RLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG 125 LA G ++++ G + + + K V +K ++R+G Sbjct: 53 SLALTHGISVMFLNSRDTP---WGILLPSVITETVKTKKA----QYETIVAKK--DIRYG 103 Query: 126 EPAPARRSVEQ-------------------LRGI-EGSRVRATYALLAKQYGVTWNGRRY 165 E + + Q L G E + R + +++ + + Sbjct: 104 EEIISSKIYNQSVHLKYWTRLTGTRNDYKELLGKDEPTAARIYWRNISQ---LLPKDIGF 160 Query: 166 DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADII 223 D +D + D N ++ + + LY ++ AG P +GF+H +P S VYD +++ Sbjct: 161 DGRDVDGVDQFNMALNYSYAILYNTIFKYLVIAGLDPYLGFIHKDRPGNESLVYDFSEMF 220 Query: 224 KFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPA 281 K + R RL +D + L LI + + Sbjct: 221 KP-YIDFLLVRALRSG-------FRLKVKDGLIEENSRGDLAKLIRKGMEEKVKEESD 270 >UniRef50_C4G3M4 Putative uncharacterized protein 
n=1 Tax=Abiotrophia defectiva ATCC 49176 RepID=C4G3M4_ABIDE Length = 340 Score = 76.2 bits (186), Expect = 1e-12, Method: Composition-based stats. Identities = 54/304 (17%), Positives = 109/304 (35%), Gaps = 40/304 (13%) Query: 16 SMIFLQYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 + ++ G +I I G F+L K G +P + I + + ++ A++ + Sbjct: 9 CLYVVEQGSKIKHIGGQFILEVKDGENRVVPDEILESISIFGNSVLTTQAIKACLEKNIN 68 Query: 75 LVWVGEAGVRVYASGQPGGARS-DKLLYQAKLALDEDLRLKVVR--------------KM 119 + ++ G R + A + D+L QA L+ + D LK + + Sbjct: 69 VSFLSTKG-RYFGKLMSNTATNPDRLKAQAYLSDNIDECLKFAKIILKAKINNQDVILRR 127 Query: 120 FELR--------------FGEPAPARRSVEQLRGIEGSRVRATYALLAK--QYGVTWNGR 163 + + E + + ++ G EG R + L+K + ++GR Sbjct: 128 YAKSSEADISSHIKDLKIYEEHIEKGKDINKIMGYEGIAARTYFEALSKLIKPEFKFSGR 187 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIAD 221 P D N +S S +Y + I +P IGF+H K + V D+ + Sbjct: 188 NKRPPK----DAFNSMLSLGYSLIYNEIFSEIENRNLSPYIGFIHKLKDRHPALVSDLIE 243 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACRDI-FRSSKTLAKLIPLIEDVLAAGEIQPP 280 + V + + N + + + F S + +++ IE+ L + Sbjct: 244 EWRAVLVDATMMSLIQGNEILIEEFTKDEYSEAVFISDLAVKQIVRKIENKLRSQNNYLE 303 Query: 281 APPE 284 E Sbjct: 304 YLNE 307 >UniRef50_C9M4E6 CRISPR-associated protein cas1 n=1 Tax=Lactobacillus helveticus DSM 20075 RepID=C9M4E6_LACHE Length = 332 Score = 75.4 bits (184), Expect = 2e-12, Method: Composition-based stats. 
Identities = 48/291 (16%), Positives = 91/291 (31%), Gaps = 41/291 (14%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 G+++ D L G + + I L + + AQ+ + Sbjct: 7 YYLFSSGELERKDNTVRLTRSDGKYKDLKIEVTRDIYLFGEVSTNTKCLNYLAQMKIPVH 66 Query: 77 WVGEAGVRVYASGQPG---GARSDKLLYQAKLALDEDLRLKVVRKM-------------- 119 + G Y L+ Q + D RL +K Sbjct: 67 FFNYYG--FYTGSFYPKEQNVSGTLLIQQVQAYTDPKRRLYYAKKFVLGAAKNLLRNLKY 124 Query: 120 FELR---FGEPAPARRSV----------EQLRGIEGSRVRATYALLAKQY--GVTWNGRR 164 ++ R + S+ E+L G+EG+ R YA + V ++ R Sbjct: 125 YQRRGRNLDDSIKEITSLIRQIDQVHDVEELMGVEGTIHRRYYASWQSVFIPDVDFSKRV 184 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADI 222 P D + +N IS +Y + I P I ++H+ + S DI++I Sbjct: 185 RRPPD----NMVNTLISFLNGLMYTTCLSEIYVTQLNPTISYLHSPMDRRFSLCLDISEI 240 Query: 223 IKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 K V F + +N + + + S K+ ++ + + Sbjct: 241 FKPMIVDRLIFSLINKNMIS-EEDFNKESNYCYLSEKSKRIIVSEYDKYMK 290 >UniRef50_A6UNF5 CRISPR-associated protein Cas1 n=1 Tax=Methanococcus vannielii SB RepID=A6UNF5_METVS Length = 322 Score = 75.4 bits (184), Expect = 2e-12, Method: Composition-based stats. Identities = 26/95 (27%), Positives = 43/95 (45%), Gaps = 8/95 (8%) Query: 134 VEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVT 191 +Q++G+EGS + + +L+K +NGR P D N ++ LY Sbjct: 155 RDQIQGLEGSVSKIYFRVLSKSLPKKYQFNGRSRKPAK----DYFNCMLNYGYGMLYSEI 210 Query: 192 EAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK 224 E + +G P IG +HT SFV+D + + Sbjct: 211 EKICIISGIDPTIGILHTDGQNRKSFVFDYIEKYR 245 >UniRef50_A5UXM3 CRISPR-associated protein, Cas1 family n=4 Tax=Chloroflexaceae RepID=A5UXM3_ROSS1 Length = 332 Score = 75.4 bits (184), Expect = 2e-12, Method: Composition-based stats. 
Identities = 45/257 (17%), Positives = 90/257 (35%), Gaps = 39/257 (15%) Query: 18 IFL-QYGQIDVI-DGAFVLIDKTGIRTHIPVGSVA-CIMLEPGTRVSHAAVRLAAQVGTL 74 +++ + G + D ++ + +P+ + +++ G ++S A + + G Sbjct: 4 LYIQEQGVMVRKRDNQVLITKDGQTLSEVPLAKIDQVVLMGRGVQLSTALLIDLLERGIP 63 Query: 75 LVWVGEAGVRVYASGQPGGAR-SDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA----- 128 + + + G R YA+ G +R D + Q + D L++ + + + Sbjct: 64 VTFTNQHGSRHYATLTAGPSRFGDLRIRQMQFVGAPDRALRLAKDIVSAKLTNQRRLLAA 123 Query: 129 ---PARRS-----------------VEQLRGIEGSRVRATYALLAKQYGVTWN--GRRYD 166 PA + V+ LRG EG+ A + W GR + Sbjct: 124 TGWPAAATAIAQIDAALTAAANAPHVDMLRGHEGAAAAAYFGAWRASLPPVWGFGGRAFY 183 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADII 223 P D IN +S + A+ G +G H G+P S D+ + Sbjct: 184 PP----PDPINAMLSFGYTLALHDVITAVQITGLDTYLGVFHVIEPGRP-SLALDLLEEF 238 Query: 224 KFDTVVPKAFEIARRNP 240 + V ++ R N Sbjct: 239 RPLIVDRLVIDLVRTNA 255 >UniRef50_B7IHY4 Cas crispr-associated protein Cas1 n=4 Tax=Bacteria RepID=B7IHY4_THEAB Length = 330 Score = 75.0 bits (183), Expect = 3e-12, Method: Composition-based stats. 
Identities = 51/282 (18%), Positives = 96/282 (34%), Gaps = 45/282 (15%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 G++ D G + +P+ ++ + + ++ + A++ G L+ + G Sbjct: 11 GELKRKDNTICFESSEGKKY-LPIENINNLWIFGEVELNKRFLDFASENGILIHFFNFYG 69 Query: 83 VRVYAS---GQPGGARSDKLLYQAKLALDEDLRLKVVRKM--------------FELR-- 123 Y + +L QA+ LD + R+K+ RK ++ R Sbjct: 70 --YYTGTFYPREHLNSGFVILKQAEHYLDNEKRIKLARKFVEGAVENLLVVLKYYKNRGY 127 Query: 124 -FGEPAP----------ARRSVEQLRGIEGSRVRATY----ALLAKQYGVTWNGRRYDPK 168 + + + VE L EG+ +R Y + K+ + R P Sbjct: 128 ELEDEIDDIKNKKEGIYSSQDVETLMSFEGN-IRDIYYKCFDKITKKEEFAFEKRSRRPP 186 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFV--YDIADIIKFD 226 + +N IS S LY I P IG++H+ F D+A+I K Sbjct: 187 ----LNKMNSLISFGNSLLYTTVLGEIYQTQLDPRIGYLHSTNNRRFTLNLDVAEIFKPS 242 Query: 227 TVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 V F + +N ++ I ++ K I Sbjct: 243 IVDRVIFSLVNKNV-LSSKDFDKQLNGIVLNNSGKKKFIEEY 283 >UniRef50_B7C8S2 Putative uncharacterized protein n=2 Tax=Eubacterium biforme DSM 3989 RepID=B7C8S2_9FIRM Length = 329 Score = 75.0 bits (183), Expect = 3e-12, Method: Composition-based stats. 
Identities = 46/258 (17%), Positives = 87/258 (33%), Gaps = 38/258 (14%) Query: 13 DRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +V ++ + G + D + VL K IP+ V I+ ++ + L + Sbjct: 2 KKVVYLY-KSGNLKRKDNSLVLESKDKDDY-IPIEQVDMIICFSEVSLNKRVLALLNKYE 59 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR--------------K 118 L+++ G + L+ Q DE RL + + K Sbjct: 60 VLILFYNFYGNYIGRYAPKDYKDGRVLVNQVNKYRDESQRLYISKSILKASIKNMLSVLK 119 Query: 119 MFELRFGEPAPARRSVEQLRGI-------------EGSRVRATYALLAKQYG---VTWNG 162 + + R +E L G+ E + + Y + + Sbjct: 120 YYRKKGKNLDELIRKLEDLVGMASDIETMNELMLIEANAKQTYYKMFDVVLENEEFKFQK 179 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIA 220 R +P E +N +S + LYG+ + + + P I F+H + S YD+A Sbjct: 180 RTKNPPQNE----VNAMLSYGYTLLYGIILSILDRSSLFPQISFIHSLSKNSDSLQYDLA 235 Query: 221 DIIKFDTVVPKAFEIARR 238 DI K + + R+ Sbjct: 236 DIFKPVYIDRMVLRLIRK 253 >UniRef50_A8ABK8 CRISPR-associated protein Cas1 n=1 Tax=Ignicoccus hospitalis KIN4/I RepID=A8ABK8_IGNH4 Length = 305 Score = 75.0 bits (183), Expect = 3e-12, Method: Composition-based stats. 
Identities = 55/257 (21%), Positives = 91/257 (35%), Gaps = 48/257 (18%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 + + G++ V DG + G ++ I+ V+ AA+R ++G LV Sbjct: 4 FVISEPGKLFVKDGGLAFANSKGEVAYLANLYDVIILATSKVSVTGAALRAMGRLGVDLV 63 Query: 77 ---WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM-------------- 119 W G R S + L Q ++ L + LK + M Sbjct: 64 VLEWNGRPSGRF--SSPVPNKSALARLKQYEVVL-KGEGLKYAKPMIVRKIIEQGRTLRY 120 Query: 120 --------------FEL-RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRR 164 +EL + A + S E LR +E R + LL++ + + GR Sbjct: 121 FAKTKRMKWLREASYELEKLSADASSAGSPEALRAVEAQAARLYWGLLSEAFP-EFPGRE 179 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH---TGKPLSFVYDIAD 221 E D N I+ + LY A+ G P G H +G+ + VYD ++ Sbjct: 180 -----HEGCDPYNSAINYSYGILYSYAFKALSVTGLDPYAGLFHAIKSGRE-ALVYDFSE 233 Query: 222 IIK---FDTVVPKAFEI 235 K +P A E+ Sbjct: 234 QFKPLVDRRALPLAHEL 250 >UniRef50_A1BI39 CRISPR-associated protein Cas1 n=5 Tax=Chlorobiaceae RepID=A1BI39_CHLPD Length = 731 Score = 75.0 bits (183), Expect = 3e-12, Method: Composition-based stats. 
Identities = 48/300 (16%), Positives = 100/300 (33%), Gaps = 49/300 (16%) Query: 18 IFL-QYGQIDVIDGAFVLIDKTGIRT-HIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 ++L + G + DG I+K G + V + +++ ++ ++ Q + Sbjct: 392 LYLQEQGSLMRKDGERFSIEKDGSVINEVIVRRIEQVVIFGNVALTTPVMQYCLQNEIPV 451 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLYQAKLAL----DEDLRLKVVR-----------KMF 120 ++ + G + + G+ +D Q + DE L+ R M Sbjct: 452 TFLSQHG-KYF--GRLEATTADNAEMQ-RFHFLRSIDEPFALETARSIVAAKISNSKTMI 507 Query: 121 ELR------------------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQY 156 R A A ++ LRGIEG + + Sbjct: 508 RRRKTVVQDRDSTLQNKMAYNLDIMADLALKAEASTDIDALRGIEGKASALYFECYGMLF 567 Query: 157 G--VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL- 213 + ++ R + D +N +S + L+ + + A+G P IGF+H + Sbjct: 568 SKNLPFHTRSFLRVRRPPTDPVNSLLSFGYTMLHTNIFSMVQASGLNPYIGFLHAERKGN 627 Query: 214 -SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 + V D+ + + + + R E D R F S+ + + + E + Sbjct: 628 PALVNDLVEEFRTIVDSLVLYTLNRGLLQEKDFYYRKDEPGCFLSNDARKRFLNIFETRM 687 >UniRef50_Q53W21 Putative uncharacterized protein TTHB145 n=3 Tax=Thermus thermophilus RepID=Q53W21_THET8 Length = 315 Score = 73.9 bits (180), Expect = 6e-12, Method: Composition-based stats. 
Identities = 55/265 (20%), Positives = 88/265 (33%), Gaps = 26/265 (9%) Query: 45 PVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQ-- 102 P V + L R+S A+ + G + + G +G L Q Sbjct: 32 PARQVRSVALWGNVRLSTPALVFLLRQGVPVFFYSLEGFLHGVAGAYPDPHPAHLRAQFA 91 Query: 103 ------AKLALDEDLRLKVVRKMFELRFGEP---------APARRSVEQLRGIEGSRVRA 147 A+ + LR + + R E A +E+LRG EG R Sbjct: 92 AEGLPLARAFVVGKLRSALAL-LERHRLPEAGGVVEALARAEGASELERLRGAEGEGSRV 150 Query: 148 TYALLAKQYG-VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGF 206 + LA+ G + GR P D +N +S + L G A+ AG P +GF Sbjct: 151 YFQGLARLLGPYGFGGRTRRPPR----DPVNAALSYGYALLLGRVLVAVRLAGLHPEVGF 206 Query: 207 VHTGKPLS--FVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKL 264 +H S D+ + + V RR P + ++ + + +L Sbjct: 207 LHAEGRRSPALALDLMEEFRVPVVDQVVLSAFRRGLLTP-SHAEVREGGVYLNEEGRRRL 265 Query: 265 IPLIEDVLAAGEIQPPAPPEDAQPV 289 I L E+ L G P + Sbjct: 266 IQLFEERLLEGVSHPLGFRKPLGET 290 >UniRef50_C2GEC7 Crispr-associated protein Cas1 n=2 Tax=Corynebacterium RepID=C2GEC7_9CORY Length = 533 Score = 73.5 bits (179), Expect = 8e-12, Method: Composition-based stats. 
Identities = 46/275 (16%), Positives = 82/275 (29%), Gaps = 39/275 (14%) Query: 18 IFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVW 77 + +Q + + +G ++ P+ V +++ +S A R +VW Sbjct: 210 LTVQGSRASIRNGRVIVEKAGERLADAPLERVQGVVIHGNVDISSALHRNFLWHNVPVVW 269 Query: 78 VGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE----------LRFGEP 127 G RVY P + Q + L + R M R G+ Sbjct: 270 CSTTG-RVYGYSCPTDGPNAAARVQQHVLS-SRGCLPIARGMVNAKIMNQATLLRRNGDS 327 Query: 128 A-------------PARRSVEQLRGIEGSRVRATYALLA--------KQYGVTWNGRRYD 166 + QL GIEG ++ + Q G TW GR Sbjct: 328 DRTVQLLIDAGTCSLEAKDNRQLFGIEGDAAALYFSTFSTMLNKAQIDQLGWTWTGRHGR 387 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK 224 D IN ++ + L A+L+ G P GF+H+ + D+ + + Sbjct: 388 ----GASDPINILLNYSYGLLRAEVIRALLSCGLDPHAGFLHSSGRNKPALALDLMEEFR 443 Query: 225 FDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSK 259 + R + + R + Sbjct: 444 APVSDSVVISLINRREIKDTDFTHIHGVARLRDTG 478 >UniRef50_Q65S18 Putative uncharacterized protein n=1 Tax=Mannheimia succiniciproducens MBEL55E RepID=Q65S18_MANSM Length = 333 Score = 73.1 bits (178), Expect = 1e-11, Method: Composition-based stats. 
Identities = 46/243 (18%), Positives = 90/243 (37%), Gaps = 37/243 (15%) Query: 18 IFLQYGQIDVI-DGAFVLIDKTGIRTH-IPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +++ ++ +G ++ + G R IP+ SV + ++ + + + + G + Sbjct: 4 LYIDRRTTELKVNGDVLICYEKGERIATIPLASVDRLYMKGDINLQISLLSKLGEKGIGV 63 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM----------FELRFG 125 V++ + + + + Q LA ++ L + + + F +F Sbjct: 64 VFLQGRKNKPMQFLPQPHNDAYRRVTQTYLADNKLFCLTLAKNIVLNKCIKQCQFLAKFI 123 Query: 126 EPAPA-----------------RRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYD 166 E P + +++ LRGIEG +A A + +NGR Sbjct: 124 EHNPKIITFIAELQKLFNLIVKQENIDSLRGIEGRMGAIYFAAFADILPRSLGFNGRNRR 183 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIK 224 P D +N +S + LY A+ AG P IGF HT S D+ + I+ Sbjct: 184 PPK----DPVNAVLSLTYTLLYSEATLAVYGAGLDPYIGFFHTLHFGRKSLSCDLMEPIR 239 Query: 225 FDT 227 Sbjct: 240 PSV 242 >UniRef50_A7GY67 Crispr-associated protein Cas1 n=6 Tax=Campylobacter RepID=A7GY67_CAMC5 Length = 332 Score = 72.7 bits (177), Expect = 1e-11, Method: Composition-based stats. 
Identities = 62/300 (20%), Positives = 102/300 (34%), Gaps = 44/300 (14%) Query: 12 KDRVSMIFLQYGQIDVIDGAFVL--IDKTGIRTH---IPVGSVACIMLEPGTRVSHAAVR 66 DR I L G++ D D+TG T +P+ ++ I + + + Sbjct: 4 SDRTHFI-LSSGRLRRQDNNIYFDKFDETGGVTASKILPINAIDEIYILTRVELDTYTLA 62 Query: 67 LAAQVGTLL----VWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR----- 117 A LL + G ++ LL Q + D R+ + R Sbjct: 63 FLADNNILLHVFSPFQSFRGNFYPSTSNSVNKSGFALLSQLRAFDDPVKRVYIAREITRA 122 Query: 118 ---------KMFELRFGEPAPARRSVE------QLRGIEGSRVRATYALL----AKQYGV 158 K ++F +PAP +++ Q+ EG+ + Y A Q Sbjct: 123 HMLNDAANCKKHGVKF-DPAPHIAALDAAADVGQIMAAEGAFQKLYYEKWNEIIADQRSF 181 Query: 159 TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFV 216 + R P D IN IS + +Y V + I P IGF+H + LS Sbjct: 182 KFTVRSKRPP----ADKINSFISYVNTRIYNVCLSEIYKTELDPRIGFLHEPNYRALSLH 237 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRS---SKTLAKLIPLIEDVLA 273 D+A+I K F + + A R F + K K+I + + +A Sbjct: 238 LDLAEIFKPILGDTLIFAMLNKKEITAKDFQTDAGRIKFSNDAIQKIEMKMISRLSETIA 297 >UniRef50_B0TFX3 Crispr-associated protein cas1 n=4 Tax=Clostridia RepID=B0TFX3_HELMI Length = 364 Score = 72.0 bits (175), Expect = 2e-11, Method: Composition-based stats. 
Identities = 38/248 (15%), Positives = 82/248 (33%), Gaps = 42/248 (16%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSV-ACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGV 83 + G V+ +K + T +P V ++ G +S A+ + G + + G Sbjct: 41 LGKKSGRLVVREKGQVVTEVPFDRVEQVTVITSGASLSTDAIEECVRHGIEINLLDFRGS 100 Query: 84 RVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA------------- 130 PG + K + LA ++ + + + + Sbjct: 101 PYAKLFAPGLTATVKTRREQLLAFNDQRSIFLAKAFVRGKIQNQINTLKYFAKYRKSARQ 160 Query: 131 -------------RRSVEQLRGIEGSRVRA----TYALLAKQYGVTWNGRRY-------- 165 +++++L GI+G + A ++ + W+ + Sbjct: 161 EVYAYLQDAALLMEKNLQELTGIDGLNIDAVRGPLMSVEGRAATRYWDAVAFLLKGYVVF 220 Query: 166 -DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADI 222 ++ D +N ++ + L A+L AG P GF+H +P S VYD + Sbjct: 221 PGRENRGATDPVNSLLNYGYAVLEARVLGAVLQAGLDPYAGFLHVDRPGKTSLVYDFIEE 280 Query: 223 IKFDTVVP 230 + + Sbjct: 281 FRQPVIDR 288 >UniRef50_A7HMV0 CRISPR-associated protein Cas1 n=3 Tax=Thermotogaceae RepID=A7HMV0_FERNB Length = 329 Score = 72.0 bits (175), Expect = 3e-11, Method: Composition-based stats. 
Identities = 53/299 (17%), Positives = 105/299 (35%), Gaps = 54/299 (18%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 L+ + S ++++ + ++P+ +++ IM+ ++ + L + Sbjct: 12 LRKKSSALYIE------------PKSEDQKPIYVPLKNISSIMVFSEIEMNKKTLELFSY 59 Query: 71 VGTLLVWV---GEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR---------- 117 + + GE Y + LL Q + D + R+ + R Sbjct: 60 SQVPVFFFNYNGEYIGCFY--PVEENKTGEMLLLQLQHYQDLEKRIIIAREILYGVADNI 117 Query: 118 ----KMFELRFGEPAPARRSVE-------------QLRGIEGSRVRATYALLAKQYGVTW 160 K ++ F E +E L IEG+ + Y L GV + Sbjct: 118 LKILKSYKNDFPEINEKIEMIEKLKKTYHRQEEIASLMAIEGNIRKNYYEAL----GVIF 173 Query: 161 NGRRYDPKDWEK---GDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SF 215 + + + ++ D IN IS + LY V + I PAI F+H S Sbjct: 174 SKKDFTFQERTARPPADEINAMISFGNTILYNVVLSEIFKTSLEPAISFLHEPNKRKFSL 233 Query: 216 VYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 DIA+I K V + R+ + D + + ++ + K + +E+ L+ Sbjct: 234 QLDIAEIFKPIIVDRTILTLVNRSMIKKD-DFKKVEGGVYLNETGKKKFVKALEEKLSE 291 >UniRef50_C1DUM1 Crispr-associated protein Cas1 n=18 Tax=Bacteria RepID=C1DUM1_SULAA Length = 331 Score = 72.0 bits (175), Expect = 3e-11, Method: Composition-based stats. 
Identities = 45/254 (17%), Positives = 82/254 (32%), Gaps = 35/254 (13%) Query: 23 GQIDVIDGAFVLIDKTGI---RTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVG 79 G+I + + +T +P+ + I + ++ A+ +Q + + Sbjct: 11 GRIRRKENTVYFETEKDGESLKTPLPINDIDTIFIFGEVDINTKAINYLSQYDIPMHFFN 70 Query: 80 EAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVVRKMFE---------LRFGEP 127 G Y+ L+ Q K +D+ R + E LR Sbjct: 71 YYG--YYSGSFLPRKKNVSGSLLVEQVKHHIDDSKRQMLAISFIEGAVHHILRNLRKSGI 128 Query: 128 APAR---------------RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEK 172 + +++E+L IEG+ +R Y L + + Sbjct: 129 SVENFQNIEKDLLPKIFETKTIEELMAIEGN-IREHYYQLFNTVIKNKDFFIEKREKRPP 187 Query: 173 GDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVP 230 + IN IS S +Y I P I ++H+ K S DIA+I K + P Sbjct: 188 TNPINALISFGNSIMYNTVLTEIYRTQLDPTISYLHSPQEKRFSLSLDIAEIFKPFIIDP 247 Query: 231 KAFEIARRNPGEPD 244 F + + D Sbjct: 248 LIFNLIKTGQITID 261 >UniRef50_B8GDW2 CRISPR-associated protein Cas1 n=1 Tax=Methanosphaerula palustris E1-9c RepID=B8GDW2_METPE Length = 327 Score = 72.0 bits (175), Expect = 3e-11, Method: Composition-based stats. 
Identities = 42/268 (15%), Positives = 88/268 (32%), Gaps = 35/268 (13%) Query: 31 AFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQ 90 +++ K G T P+G V +++ G + A ++ G + + G V Sbjct: 26 RLLIVQKNGTTTEYPIGDVHHLLVVGGHTIHSAVLQHMQNAGNWVSFFAADGTPVGLIRP 85 Query: 91 PGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF-------------------------- 124 P +++ + A L + R R Sbjct: 86 PEDRVDEQVRAIQRHAPAHSYALGITRAALGRRLQVIGETTVVTGESPLYQGELEVLQDA 145 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTINQCISA 182 + +++++R + Y ++A+ G + R P D +N +S Sbjct: 146 RQELEYLVTLDEIRRLHRLATDMYYEIMARTIPKGTGFRRRTARP----YMDPVNTMLSF 201 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGE 242 + L GV + A IG +H G + V D+ ++ K V F + R+ Sbjct: 202 SYGILSGVCAVHLAGAHLDANIGLLHQG-ERALVRDLTELFKPQMVDQPIFALVRQGITA 260 Query: 243 PDREVRLACRDIFRSSKTLAKLIPLIED 270 D E + S + +++ ++ Sbjct: 261 SDYE--IGESRCTLSDALIRRMLLHLQT 286 >UniRef50_D2PIT7 CRISPR-associated protein Cas1 n=2 Tax=Sulfolobus RepID=D2PIT7_SULIS Length = 255 Score = 71.6 bits (174), Expect = 3e-11, Method: Composition-based stats. Identities = 33/114 (28%), Positives = 49/114 (42%), Gaps = 8/114 (7%) Query: 134 VEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEA 193 + LR IE R + L K + GR+ D IN+ I A S +Y + Sbjct: 93 LNSLRLIEAEYGRKAWDELKKFLPREFTGRK-----PRNEDPINRAIDYAYSIIYSLCTH 147 Query: 194 AILAAGYAPAIGFVHTGKP--LSFVYDIADIIKFDTVVPKAFEIARRNPGEPDR 245 A++A G P G +H+ P LSF YD +++ K V +RR D+ Sbjct: 148 ALIAVGLDPYAGVMHSNFPGRLSFTYDFSEMFK-SVAVHVVISSSRRVKLSLDK 200 >UniRef50_UPI0001C41A73 CRISPR-associated protein Cas1-2 n=1 Tax=Methanobrevibacter ruminantium M1 RepID=UPI0001C41A73 Length = 334 Score = 71.2 bits (173), Expect = 4e-11, Method: Composition-based stats. 
Identities = 39/258 (15%), Positives = 76/258 (29%), Gaps = 41/258 (15%) Query: 25 IDVIDGAFVL-IDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT---LLVWVGE 80 + D V+ + I ++ ++ G ++ A+ L A+ + W G Sbjct: 12 VAKRDNQIVIKENGKEINYYLAKDISQILLTGKG-SITFDALTLLAENDVDCVSINWKGH 70 Query: 81 AGVRVYASGQP-------------GGARSDKLLYQAKLALDED----------------- 110 R+ A + + ++ Sbjct: 71 VDYRLSAPDRKNAIVKKEQYFALTDSRSGYLAKAFVRAKIENQKAVLGTLAKSREEKDYI 130 Query: 111 --LRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPK 168 R KV + ++ + + GIEG ++ A W + Sbjct: 131 IEQREKVSEHIGKIEKLSNINSDNIRNNILGIEGQASHEYWSAFASVLDEKWEF--FGRS 188 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKFD 226 D +N ++ + + +I AG P GF+H+ + S VYD+ + + Sbjct: 189 GRGAKDPVNSLLNYGYAVIESEIWKSIYLAGLDPYCGFLHSERYGRASLVYDLIEEFRQQ 248 Query: 227 TVVPKAFEIARRNPGEPD 244 V I RN PD Sbjct: 249 IVDKTVLSIVNRNQITPD 266 >UniRef50_Q8F1F5 Putative uncharacterized protein n=2 Tax=Leptospira interrogans RepID=Q8F1F5_LEPIN Length = 550 Score = 71.2 bits (173), Expect = 5e-11, Method: Composition-based stats. 
Identities = 52/269 (19%), Positives = 93/269 (34%), Gaps = 50/269 (18%) Query: 5 PLNPIPLKDRVSMIFL--QYGQIDVIDGAFVL--IDKTGIRTH---IPVGSVACIMLEPG 57 P+ P K + + + +I D ++ + +TG ++ IP+ + + + Sbjct: 201 PIRLFPEKREKTTLHVFGHDSRIKKSDNVLLVEKVTETGEKSKSEKIPIQEIESVNIHGN 260 Query: 58 TRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR 117 ++S ++ + W G + + + Q K E +RL + + Sbjct: 261 CQISSQMIKFLVSEEIPVHWFSGGGNYIGGININPSGVQRR-IRQFKALTKETIRLNLAK 319 Query: 118 KM----------FELRFGE---------------------PAPARRSVEQLRGIEGSRVR 146 K+ + LR + S QL GIEGS R Sbjct: 320 KLVSAKCESQLRYLLRATRGKDETRNETESYLATIRSGLKNIESADSPSQLLGIEGSSAR 379 Query: 147 ATYALLAKQYG-----VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYA 201 A ++ L + NGR P D N +S S LY AI+A G Sbjct: 380 AYFSGLPALLKNSDPFLVPNGRSKRPPK----DPFNATLSFLYSLLYKSVRQAIIAVGLD 435 Query: 202 PAIGFVHTGKPLS--FVYDIADIIKFDTV 228 P+ GF HT + + V D+ ++ + Sbjct: 436 PSFGFYHTPRSSAEPLVLDLMELFRVSLC 464 >UniRef50_A5D0Y0 Uncharacterized protein n=40 Tax=cellular organisms RepID=A5D0Y0_PELTS Length = 344 Score = 71.2 bits (173), Expect = 5e-11, Method: Composition-based stats. 
Identities = 55/297 (18%), Positives = 91/297 (30%), Gaps = 47/297 (15%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 G++ D + G R IPV IM+ V+ + + L + G Sbjct: 25 GELSRKDNTLYFETEEGRRF-IPVEDTGEIMIFGEVDVNKKLLEFLSVKEITLHFFNYHG 83 Query: 83 VRVYAS---GQPGGARSDKLLYQAKLALDEDLRLKVVR--------------KMFELR-- 123 Y + L QA+ LDE+ RL + + K + R Sbjct: 84 --YYMGSFYPREHLNSGYMTLKQAEHYLDEEKRLVIAKEIVRGAAKNIRQVLKYYYGREK 141 Query: 124 -----------FGEPAPARRSVEQLRGIEGSRVRATYA---LLAKQYGVTWNGRRYDPKD 169 P R L +EG+ Y + + R P Sbjct: 142 DVGGKLNAIENLMAPIEECRDTSSLMALEGNIRDHYYRAFDEIVDNPDFAFQERSRRPPK 201 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFV--YDIADIIKFDT 227 + +N IS S +Y + + I P IG++H F DIA+I K Sbjct: 202 ----NYLNTLISFGNSLIYTICLSEIYKTHLDPRIGYLHATNFRRFTLNLDIAEIFKPII 257 Query: 228 VVPKAFEIARRNPGEPDREVR----LACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 V F + + + R + ++ R + L ++ + EI P Sbjct: 258 VDRLIFTLLGKKMITKEDFDRGTEGIMMKEKARKC-FVENLDEKLKTTINHREIGRP 313 >UniRef50_C9M4Y8 CRISPR-associated protein Cas1 n=1 Tax=Jonquetella anthropi E3_33 E1 RepID=C9M4Y8_9BACT Length = 342 Score = 70.8 bits (172), Expect = 5e-11, Method: Composition-based stats. 
Identities = 46/264 (17%), Positives = 81/264 (30%), Gaps = 46/264 (17%) Query: 18 IFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 ++L Y GQI A V+ K PV + I+ ++ A+ L + Sbjct: 8 LYLSYDEGQISCSGRALVIRAKDQAPQQFPVHILEQIVCFGSVMLTPDAMNLCLANNVTI 67 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLY-QAKLALDEDLRLKVVRKMFE----------LRF 124 ++ G R + L Q + A D ++ LR Sbjct: 68 NYLSVYG-RFRGRISGPVRGNVLLRRMQFRRADDPVQTAELATAFLLSKISNARTVLLRH 126 Query: 125 GEPAPAR---------------------RSVEQLRGIEGSRVRATYA-----LLAKQYGV 158 + +LRG+EG + +L + Sbjct: 127 ARERETNVFDEAVRDMAGLLVKLKGFTITDLNELRGLEGDAANIYFRCFDSMILKNRETF 186 Query: 159 TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFV 216 +++GR P D IN +S S L + + G P++GF+H +P S Sbjct: 187 SFHGRSRRPP----SDPINALLSLGYSLLAAEITGVLESVGLDPSVGFLHKDRPGRPSLA 242 Query: 217 YDIADIIKFDTVVPKAFEIARRNP 240 D+ + + V + R Sbjct: 243 LDLMEEFRAVVVDRFVLALVNREQ 266 >UniRef50_C5BP90 CRISPR-associated protein Cas1 n=3 Tax=Gammaproteobacteria RepID=C5BP90_TERTT Length = 339 Score = 70.8 bits (172), Expect = 5e-11, Method: Composition-based stats. 
Identities = 50/277 (18%), Positives = 88/277 (31%), Gaps = 45/277 (16%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 I Q + V+ ++ IP+ ++ I VS + A+ G L Sbjct: 9 FITRQKSYVHKQRETIVVEQESEKILQIPIHAIKSIFCFGNVIVSPFLLGFCAENGVGLA 68 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF------------ 124 + E G + + G + LL + + + L V R + + Sbjct: 69 FFTEYG--RFLARIQGPQSGNVLLRRIQYEKTKSAPLDVARAIIAAKIVSSRSVLQRHIR 126 Query: 125 ---GEPAPARR---------------SVEQLRGIEGSRVRATYALLAK------QYGVTW 160 + + S+++LRG EG +++ G + Sbjct: 127 NYGSQDDVVKVIGRLKHNLEQARVDPSLDRLRGTEGVAAANYFSVFQHLVRVENDAGFAF 186 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYD 218 NGR P D IN +S S L AA+ G P +GF+HT + S D Sbjct: 187 NGRNKRPP----TDPINAMLSFLYSVLGNDISAALQGVGLDPQVGFLHTDRSGRDSLAMD 242 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIF 255 + + ++ V R + R+ A Sbjct: 243 LLEELRAWWVDRLVLTQVNRREIKA-RDFSQAVSGAV 278 >UniRef50_A4J500 CRISPR-associated protein Cas1 n=1 Tax=Desulfotomaculum reducens MI-1 RepID=A4J500_DESRM Length = 544 Score = 70.8 bits (172), Expect = 5e-11, Method: Composition-based stats. 
Identities = 43/292 (14%), Positives = 92/292 (31%), Gaps = 44/292 (15%) Query: 17 MIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 ++++ + ++ IP+ ++ ++L +S ++L GT Sbjct: 214 VLYVDEQGASLYKKGERVLVTKDQIKFKDIPLCNLDQVVLVGNVNLSSQLIKLFLGRGTE 273 Query: 75 LVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE----------LRF 124 + ++ G S + Q + ++ RL + LR+ Sbjct: 274 VHFISTKGKYYGCLQAALSKNSVLRIAQHRAYQKQEERLLYASEFVRGKLSNMRTNLLRY 333 Query: 125 GEPA-------------------PARRSVEQLRGIEGSRVRATYALLA----KQYGVTWN 161 + + +L G+EG+ R +++ + +N Sbjct: 334 NRSLNNHSIDEAVSRIKNIIKRLEKAKDLNELMGLEGAGSRDYFSVFGLLIKDRVPFDFN 393 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYD 218 R P + D N +S + S L A+ G+ P IGF+H G+P + D Sbjct: 394 KRSRRPPE----DPANALLSFSYSLLLKDVITAVQVVGFDPFIGFLHRSDFGRP-ALALD 448 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 I + + + + + F S K+ L E+ Sbjct: 449 IIEEFRPVVADSVVLTALNKGVIA-EGDFEYRMGGCFLSETGRKKMYRLYEE 499 >UniRef50_Q6L317 DNA polymerase n=2 Tax=Thermoplasmatales RepID=Q6L317_PICTO Length = 320 Score = 70.8 bits (172), Expect = 6e-11, Method: Composition-based stats. 
Identities = 54/263 (20%), Positives = 95/263 (36%), Gaps = 37/263 (14%) Query: 15 VSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 + Q G+I + K R H+PV +V I++ +S A+ +++G + Sbjct: 2 YNFYITQDGEITRDGNTLYFVGKDFKR-HLPVMNVNEIIISAKVSISSWALDYLSKLGIM 60 Query: 75 LVWVGEAGVRVYASGQPGGARSDK---LLYQAKLALDEDLRLKVVRKM------------ 119 + + G Y S G R++K ++ Q + L++D RL + +M Sbjct: 61 VHILNIYG--NYMSSLIPGNRNEKGNIIIMQVRSYLNDD-RLYIASQMVLGIKHNILRNL 117 Query: 120 --FELR---------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTW-NGRRYDP 167 + P ++ + G EG+ Y+ Y R + P Sbjct: 118 RYYNKNNALDDKIEKISGYYPDGNNINSILGTEGNIWSTYYSAFPYIYKKYHEFKREFHP 177 Query: 168 KDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSF--VYDIADIIKF 225 D +N IS S LY +I G P+I ++H SF DI+DI K Sbjct: 178 PK----DELNAMISFGNSLLYSNVITSIFLNGLNPSISYLHEPSERSFSLALDISDIFKP 233 Query: 226 DTVVPKAFEIARRNPGEPDREVR 248 V + N + + + Sbjct: 234 VIVERVIANLVNNNIIDSNHFTK 256 >UniRef50_A4FXZ8 CRISPR-associated protein, Cas1 family n=9 Tax=cellular organisms RepID=A4FXZ8_METM5 Length = 342 Score = 70.8 bits (172), Expect = 6e-11, Method: Composition-based stats. 
Identities = 42/245 (17%), Positives = 82/245 (33%), Gaps = 49/245 (20%) Query: 38 TGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSD 97 IP V I++ + +S A+ LA + LV + G + Sbjct: 40 GENVQEIPAKKVEQILITCPSSISTEAISLAVEENIDLVLLKMNGKPIGRFWHSKHGSIS 99 Query: 98 KLLYQAKLALDEDLRLKVVR--------------KMFELRFGEPAPA------------- 130 + + L + +L + V+ KM + + Sbjct: 100 TIRKKQLLLSENELGITFVKEWISKKMENQIDFLKMLSMNRRDERRELLKENVLKIDEEI 159 Query: 131 --------RRSVEQLR----GIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTI 176 ++++++R G EG R + +++K +NGR +P D Sbjct: 160 KKLDNVTFNQNIDEIRNTVQGYEGYASRVYFEMISKSLPEKYQFNGRSRNPAK----DYF 215 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK--FDTVVPKA 232 N ++ LY E + + AG P IG +H +F +D+ + + D + K Sbjct: 216 NCMLNYGYGILYSQIERSCIIAGLDPYIGILHVDNYNRKAFTFDLIEKYRIYVDKTIFKM 275 Query: 233 FEIAR 237 F + Sbjct: 276 FSTKK 280 >UniRef50_A7NP58 CRISPR-associated protein Cas1 n=6 Tax=Chloroflexi (class) RepID=A7NP58_ROSCS Length = 338 Score = 70.4 bits (171), Expect = 6e-11, Method: Composition-based stats. 
Identities = 50/288 (17%), Positives = 91/288 (31%), Gaps = 58/288 (20%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRLAAQVGTLLVWVGEAGV 83 I G ++ + + +P+ + I++ G +S VR A+ G + ++ A Sbjct: 12 ISKHQGRIRVMKEKERLSEVPIMHLEQILICSDGVGLSSDVVRACAEEGIPIHFLNSANG 71 Query: 84 RVYASGQPGGARSDKLLYQAKLALDEDLR-LKVVRKM-------------FE-------- 121 Y + G L +A+L +D R L++ + + Sbjct: 72 GDYGTFVHSGITGMALTRRAQLRAGDDERGLRLAQAFASGKIQSQANMLRYAAKNRKEND 131 Query: 122 -------LRFGEPAPARR-------------SVEQLRGIEGSRVRATYALLAKQY--GVT 159 +R + L G EG + +A+ + Sbjct: 132 PDLHNDLMRTATEILDALPPLRAVRGVLTDETRAALMGFEGMAGARYWTAVARIIPDDLG 191 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVY 217 W GR D NQ ++ L A++ AG P GF+H +P S Sbjct: 192 WPGRETR----GARDRFNQALNYGYGVLQSQVRTALILAGLDPNAGFLHADRPGKPSLTL 247 Query: 218 DIADIIKFDTVVPKA-------FEIARRNPGEPDREVRLACRDIFRSS 258 D+ + + V FEI +R+ G D + R + Sbjct: 248 DLIEEFRQAVVDRTLIGLVNRQFEIVQRDDGLLDEDTRKRIAEKILER 295 >UniRef50_D2LF35 CRISPR-associated protein Cas1 n=1 Tax=Rhodomicrobium vannielii ATCC 17100 RepID=D2LF35_RHOVA Length = 366 Score = 70.4 bits (171), Expect = 7e-11, Method: Composition-based stats. 
Identities = 46/253 (18%), Positives = 76/253 (30%), Gaps = 36/253 (14%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 + V + ++ P+ V+ + + RV+ A + G +VW G G Sbjct: 50 AVVRVNNSTLLVERPGEPVFERPIELVSTLHIHGWARVTGACIGRLTAQGATVVWRGLHG 109 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM-----FELR-------------- 123 V + GA D Q A E L + R + +R Sbjct: 110 YPVALAQPMHGAGLDIRRAQYFEAAGERG-LAIARALISAKIQNMRGLVRRRANIEGRDC 168 Query: 124 -----FGEPAPARRSVEQLRGIEGSRVRATYA----LLAKQYG-VTWNGRRYDPKDWEKG 173 S E L GIEGS ++ + A + G V + R P Sbjct: 169 LTALAALAKKAKHASRESLLGIEGSATAFYFSAWPHMFAARAGDVEFEVRSRRPPQ---- 224 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPK 231 + +N +S A + L A+ A G P +G H + S D+ + + Sbjct: 225 NAVNATLSYAYAVLSAECVCALAAVGLDPRLGVFHQPRSGRASLALDLMEPFRPLIADQA 284 Query: 232 AFEIARRNPGEPD 244 Sbjct: 285 VLTGFNTGQIRTG 297 >UniRef50_Q2SIC8 CRISPR-associated protein Cas1 n=6 Tax=Gammaproteobacteria RepID=Q2SIC8_HAHCH Length = 338 Score = 70.4 bits (171), Expect = 7e-11, Method: Composition-based stats. 
Identities = 49/300 (16%), Positives = 91/300 (30%), Gaps = 56/300 (18%) Query: 11 LKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K + +++ + V+ P +VA I VS + Sbjct: 1 MKKLQNSLYVTRQESYLHKERETIVIKQGKDKLAQFPAHAVANIFCFGQISVSPFLMGYC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDL-RLKVVRKMF------E 121 + G LV+ E G + A Q + + L + D+D RL + R + Sbjct: 61 GEQGIGLVFFTEYG-KFLARIQGRQSGNVLLRREQYRVSDKDTPRLDIARSIVLAKIANS 119 Query: 122 LRFGEPAPARR------------------------SVEQLRGIEGSRVRATY---ALLAK 154 R + + +++ LRG+EG + L K Sbjct: 120 RRVLQREVRNKGAHPSLDEAIARLANCLRRCERPDNLDILRGLEGEAAAIYFGVFNQLLK 179 Query: 155 QYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL- 213 Q G + GR P D +N +S + + +A+ G P +G++H +P Sbjct: 180 QDGFDFKGRVRRPP----TDPVNALLSFLYTLVAQEISSALQGVGLDPYVGYLHVDRPGR 235 Query: 214 -SFVYDIADIIKFDTVVPKA-------------FEIARRNPGEPDREVRLACRDIFRSSK 259 DI + + F+I + R ++ K Sbjct: 236 VGLALDILEEFRAWWCDRLVLTLINRKEVKASDFDIEASGAVRLKEDARRKVLVAYQEKK 295 >UniRef50_Q74H36 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Geobacter sulfurreducens RepID=Q74H36_GEOSL Length = 559 Score = 70.4 bits (171), Expect = 8e-11, Method: Composition-based stats. 
Identities = 51/293 (17%), Positives = 93/293 (31%), Gaps = 55/293 (18%) Query: 5 PLNPIPLKDRVSMIFLQYGQID-VIDGAFVLIDKTGIRTHIPVGSVA----CIMLEPGTR 59 P IP R +++Q + DG ++I++ R + + + T Sbjct: 211 PRPIIPADGRGLPLYVQSPKAYVRKDGDCLVIEEE--RVRVAEARLGETSQVALFGNAT- 267 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRK- 118 ++ AA+ + + W+ G + + G + YQ + + D + L + R+ Sbjct: 268 LTTAALHECLRREIPVTWLSYGGWFMGHTVSTGHRNVETRTYQYQRSFDPETCLNLARRW 327 Query: 119 -----------MFELRFGEPAPARR-------------------SVEQLRGIEGSRVRAT 148 + GE A+ S+E L GIEG+ Sbjct: 328 IVAKIANCRTLLRRNWRGEGDEAKAPPGLLMSLQDDMRHAMRAPSLEVLLGIEGASAGRY 387 Query: 149 YALLAKQY--------GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGY 200 + ++ G + R P D +N +S A + L A+ A G Sbjct: 388 FQHFSRMLRGGDGEGMGFDFTTRNRRPPK----DPVNALLSFAYAMLTREWTVALAAVGL 443 Query: 201 APAIGFVHT---GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 P GF H G+P + D+ + + VR A Sbjct: 444 DPYRGFYHQPRFGRP-ALALDMMEPFRPLIADSTVLMAINNGEIRTGDFVRSA 495 >UniRef50_Q9YCL8 Putative CRISPR-associated protein Cas1 n=1 Tax=Aeropyrum pernix RepID=Q9YCL8_AERPE Length = 327 Score = 70.0 bits (170), Expect = 1e-10, Method: Composition-based stats. Identities = 30/154 (19%), Positives = 54/154 (35%), Gaps = 15/154 (9%) Query: 122 LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTINQC 179 LR + ++L IE R + +A+ + ++GR D D N Sbjct: 144 LRIADEDGPGF-RDKLLSIEARASRRYWQCIAEILPGRLGFSGR-----DRGALDPFNAA 197 Query: 180 ISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVP-KAFEI- 235 ++ LY + E ++L G P +G H+ K S D + + V A + Sbjct: 198 LNYGYGMLYSIVEKSLLLVGLDPYLGVFHSEKSGKPSLTLDAIEPFRAPIVDRILALKAG 257 Query: 236 ---ARRNPGEPDREVRLACRDIFRSSKTLAKLIP 266 + G D + R SS ++ + Sbjct: 258 RMYLKLEAGRLDYKSRKEVAKAVASSLSMKAAVR 291 >UniRef50_A5ILM3 CRISPR-associated protein, Cas1 family n=3 Tax=Bacteria RepID=A5ILM3_THEP1 Length = 327 Score = 70.0 bits (170), Expect = 1e-10, Method: Composition-based stats. 
Identities = 48/291 (16%), Positives = 93/291 (31%), Gaps = 44/291 (15%) Query: 16 SMIFLQYGQIDVIDGAFV--LIDKTGIRTH--IPVGSVACIMLEPGTRVSHAAVRLAAQV 71 + G+I + + + D+ G + IPV +V I ++ + AA+ Sbjct: 4 NYYVFSSGRIRRRENSILIEYQDRDGKQQKRFIPVENVDQIFFLGEVDLNSKFLDFAAKN 63 Query: 72 GTLLVWVGEAGVRVYAS---GQPGGARSDKLLYQAKLALDEDLRLKVVRKM--------- 119 +L + G Y + + L+ Q + LD + RL + RK Sbjct: 64 NIVLHFFNYYG--YYTGSFYPREKFLSGELLVRQVEHYLDNEKRLSLARKFVEGAIHNFK 121 Query: 120 ----------------FELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGR 163 ++ R ++ +L E + Y+ + R Sbjct: 122 RNIEKRGFDIVSKISEYQER----IKHVATIPELMSCEAHARKLYYSTWEDITDWPFEER 177 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIAD 221 P E +N IS S Y V + P + ++H K S DI++ Sbjct: 178 SMQPPLNE----LNALISFGNSLTYSVVLKELYHTHLNPTVSYLHEPGTKRFSLALDISE 233 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 I K V F++ + + +F + + + E++L Sbjct: 234 IFKPIFVDRIIFKLINLGKIKRENHFLQESNGVFLNDEGRRIFVEEFENML 284 >UniRef50_C9LM09 CRISPR-associated protein Cas1 n=1 Tax=Dialister invisus DSM 15470 RepID=C9LM09_9FIRM Length = 331 Score = 70.0 bits (170), Expect = 1e-10, Method: Composition-based stats. 
Identities = 45/271 (16%), Positives = 92/271 (33%), Gaps = 43/271 (15%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S I++ +++ G +V+ + +P V + L ++S + + + Sbjct: 1 MSWIYVTEPGAKLNRQGGRYVISRENETICEVPSAVVEGVTLFDSIQISSSVIVDFLERN 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAK---LALDEDLRLKVVRKM---------- 119 L W+ G R + G+ +L Q + D+D L + +++ Sbjct: 61 IPLTWISSTG-RFF--GRLESTDHQNVLRQKEQFDALADKDFCLALAKRVVFGKVYNQRT 117 Query: 120 ----FELRFGEPAPARR---------------SVEQLRGIEGSRVRATYALLAKQYGVTW 160 + R +P + SVE++ G EG R + + + Sbjct: 118 ILRNYNRRAEDPFIEKVRSDIRILADKLHMAHSVEEVMGYEGMMARIYFQAIGHILPEEF 177 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVY 217 + + D N +S + L +AI+ G P IGF+H G P + Sbjct: 178 RFEKRTKRPPR--DYFNSLLSFGYTLLMYDFYSAIVNCGLHPYIGFLHALRNGHP-ALAS 234 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVR 248 D+ + + V + D V+ Sbjct: 235 DLMEPWRPAVVDAFCLSLVTHREISKDYFVK 265 >UniRef50_D1N0J7 CRISPR-associated protein Cas1 n=1 Tax=Victivallis vadensis ATCC BAA-548 RepID=D1N0J7_9BACT Length = 347 Score = 70.0 bits (170), Expect = 1e-10, Method: Composition-based stats. 
Identities = 52/304 (17%), Positives = 92/304 (30%), Gaps = 63/304 (20%) Query: 18 IFLQY-GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV------SHAAVRLAAQ 70 ++L G++ D T P + V S AV + + Sbjct: 3 LYLTRSGRLRRRDNTLRFERVNLPETEDPEVEAEALEEGKADAVQALPVESIDAVYVFGE 62 Query: 71 --VGTLLV--------------WVGEA-GVRVYASGQPGGARSDKLLYQAKLALDEDLRL 113 V T L+ W G G + D ++ Q + + + RL Sbjct: 63 LSVNTKLINFLNCKKVPVHFFNWYGHHTGTLL---PHAEQLSGDLVIRQGEAYRNPEERL 119 Query: 114 KVVR--------------KMFELRFGEPAPARRSVEQL-------------RGIEGSRVR 146 + R + ++ R G + +V++L G+EG+ + Sbjct: 120 MICRNLLEAVFHNILSVLQYYQRRKGGLEKSIAAVKELEQKLAEQDTPEGLMGLEGNVRK 179 Query: 147 ATYALLAKQYGVTW--NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAI 204 Y G T R Y P D + +N IS S LY + + P I Sbjct: 180 LYYQSWPVWLGKTAEHFKRVYHPPD----NPLNALISFLNSLLYTACVSELYRTALYPGI 235 Query: 205 GFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLA 262 ++HT + S D+ + K V F + DR+ R + Sbjct: 236 SYLHTPQTRRFSLALDLVEPFKPLLVDRMIFRLLDSRAIG-DRDFRKHSNGFLLTDDARR 294 Query: 263 KLIP 266 +++ Sbjct: 295 RILQ 298 >UniRef50_A7HNI6 CRISPR-associated protein Cas1 n=1 Tax=Fervidobacterium nodosum Rt17-B1 RepID=A7HNI6_FERNB Length = 334 Score = 70.0 bits (170), Expect = 1e-10, Method: Composition-based stats. 
Identities = 48/298 (16%), Positives = 103/298 (34%), Gaps = 47/298 (15%) Query: 15 VSMIFLQYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT 73 +++ L+ G + DG ++ + IP+ + I L ++ + Sbjct: 1 MTIYVLEQGTVLAKKDGRMIITKAKQVLDEIPLKKIERINLLGNITLTSQMINYCLDNKI 60 Query: 74 LLVWVGEAGV---RVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE--------- 121 ++++ + G ++Y L Q + A D+ +L++ + + + Sbjct: 61 EVIFMTQHGRYRGKLYT---DEYRNVLLRLKQYERATDKQFQLEISKSIVQGKLQNYYNF 117 Query: 122 -----LRFG---------------EPAPARRSVEQLRGIEGSRVRATYALLAKQYG---V 158 E ++V+++RG EG + ++ K + Sbjct: 118 LTQKSKNLPKGLLSEERAGIRTVIEKVNKAKTVDEVRGYEGIGSKIYFSGFKKCIRTEEL 177 Query: 159 TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFV 216 T+NGR P D IN +S LY AI A G P G +HT S + Sbjct: 178 TFNGRTAHPPK----DEINAMLSLGYYFLYVEMLLAINAVGLDPYFGNLHTIDVSKQSLL 233 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFR-SSKTLAKLIPLIEDVLA 273 +D+ + + + + + + + DI+ + + K I E ++ Sbjct: 234 FDLVEEFRCVIIDNFVLNLINLKTIKKE-DFEKRENDIYYFTKDGMKKYITEYEQMMK 290 >UniRef50_B8GSH8 CRISPR-associated protein Cas1 n=1 Tax=Thioalkalivibrio sp. HL-EbGR7 RepID=B8GSH8_THISH Length = 691 Score = 69.6 bits (169), Expect = 1e-10, Method: Composition-based stats. 
Identities = 44/287 (15%), Positives = 85/287 (29%), Gaps = 38/287 (13%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSV-ACIMLEPGTRVSHAAVRLAAQVGTLLVWVG 79 + ++ + G + P+ ++ +L P + L + L + G Sbjct: 371 EPARVGLDGGRLRVQRGEAELLSAPLETLAGVTLLGPHQVSTQLLGALLDRGIPLALATG 430 Query: 80 EAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP--------- 129 + +R G PG A + QA D+ L+ R + + R + Sbjct: 431 QGRLRGVLWNGVPGDAGPGLWMRQAACFEDDARALEAARAVVDARLRQQREVLRNRMSPE 490 Query: 130 -----------------ARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDW 170 A L G+EG R + LA+ + + GR P Sbjct: 491 RCDDLLPRLDRLIAKTAAASDRASLNGLEGQAARLYFGALAELLPPELGFTGRNRRPPR- 549 Query: 171 EKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTV 228 D N +S + L+ + + G P GF H L + D+ + + V Sbjct: 550 ---DPFNVLLSLGYTVLHAHVDTVVRLNGLYPWRGFYHQPHGLHPALASDLMEPFR-HLV 605 Query: 229 VPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 A + R + + + + + + L Sbjct: 606 ERVALNVVARGRIRV-SDFAQQGDACRIEAGARRRYLADLSERLLTP 651 >UniRef50_UPI000197AF65 hypothetical protein BACCOPRO_02409 n=1 Tax=Bacteroides coprophilus DSM 18228 RepID=UPI000197AF65 Length = 340 Score = 69.6 bits (169), Expect = 1e-10, Method: Composition-based stats. 
Identities = 45/274 (16%), Positives = 91/274 (33%), Gaps = 57/274 (20%) Query: 11 LKDRVSMIFLQ--YGQIDVIDGAFVLIDKTGIRTHIPVGSV-ACIMLE-----PGTRVSH 62 ++ ++ +++ I V+ IP ++ + + PG Sbjct: 1 MRKLLNTLYVTTPNAYISKDGLNIVVSVNQEEVFRIPAINIESIVTFGYMGASPG----- 55 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLY-QAKLALDEDLRLKVVRKM-- 119 ++L + G L ++ G + + Q + L Q +L DE L V M Sbjct: 56 -VMKLCSDSGISLTFLSPHG-KFISRVQSATKGNVLLRKKQYQLVDDEAWSLHVSLLMIG 113 Query: 120 ------------FELRFGEPAPARRSVEQLR----------------GIEGSRVRATYAL 151 + +GE ++++ L G EG A + + Sbjct: 114 GKIQNYRNILRRYIRDYGENENVNQAIQTLERAKRNALQAPDKTTLIGYEGLASNAYFEV 173 Query: 152 L-----AKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGF 206 L ++ + GR P D +N +S A + + AA+ G P +GF Sbjct: 174 LPVLILNQKADFPFQGRNRRPPK----DAVNAMLSFAYTLIANDVAAALETIGLDPYVGF 229 Query: 207 VHTGKPL--SFVYDIADIIKFDTVVPKAFEIARR 238 +HT +P S D+ + ++ + + Sbjct: 230 LHTLRPGRTSLALDMMEELRAYLGDRFVLSLINK 263 >UniRef50_A3MVN2 CRISPR-associated protein, Cas1 family n=4 Tax=Thermoproteaceae RepID=A3MVN2_PYRCJ Length = 294 Score = 69.3 bits (168), Expect = 2e-10, Method: Composition-based stats. 
Identities = 49/273 (17%), Positives = 84/273 (30%), Gaps = 47/273 (17%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVA-CIMLEPGTRVSHAAVRLAAQVGTLLVWVGE 80 YG +++++ G R P+ V +L G ++ A+R + G ++ + Sbjct: 8 YGTRIRTRKGLLVVERGGERREYPLHQVDEVFILTGGVSITSRALRALLRAGAVVAVFDQ 67 Query: 81 AGVRV------------------YASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 G + YA+ GG +L+ V+ + Sbjct: 68 RGEPLGIFMRPVGDATGEKRRCQYAAA-AGGRGLQWAREWV--WKKMRGQLQNVKA-WRR 123 Query: 123 RFGE-------------PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKD 169 R A S ++ E + A + + G GR D Sbjct: 124 RLAHYGDYVEQIGRALEALRAAASPGEVMEAEAAAAEAYWRAYGEVTGFP--GR-----D 176 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDT 227 E GD +N ++ L + +IL AG P +GF+H K S V D + + Sbjct: 177 QEGGDPVNAALNYGYGVLKALCFKSILLAGLDPYVGFLHVDKSGRPSLVLDFMEQWRPRV 236 Query: 228 VVPKAFEI--ARRNPGEPDREVRLACRDIFRSS 258 A G D + RL Sbjct: 237 DAVVAKVAGELATENGLLDHKSRLRVAAAVLEE 269 >UniRef50_B0K547 CRISPR-associated protein Cas1 n=12 Tax=Bacteria RepID=B0K547_THEPX Length = 330 Score = 69.3 bits (168), Expect = 2e-10, Method: Composition-based stats. 
Identities = 48/254 (18%), Positives = 84/254 (33%), Gaps = 44/254 (17%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 G++ D + R IPV + + IM+ ++ + Q +L + G Sbjct: 11 GELKRKDNTL-FFEGENGRKFIPVENTSEIMVFGEVSLNKRLLEFLTQSEIILHFFNHYG 69 Query: 83 VRVYAS---GQPGGARSDKLLYQAKLALDEDLRLKVVRKM--------------FELR-- 123 Y + +L QA+ D RL + +K + R Sbjct: 70 --YYVGSYYPREHLNSGYMILRQAEHYNDGSKRLYLAQKFVEGAYKNIRQVLKYYSNRGK 127 Query: 124 -----------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQY----GVTWNGRRYDPK 168 GE + ++ +L IEG+ +R Y + ++ R P Sbjct: 128 DLEDVIYSIEKLGESVDSTSTINELMAIEGN-IREYYYKAFDEIIQNPDFKFDFRSKRPP 186 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFD 226 + +N IS S +Y T + I P IGF+H S D+++I K Sbjct: 187 Q----NFLNTLISFGNSLMYTTTLSEIYKTHLDPRIGFLHATNFRRFSLNLDVSEIFKPI 242 Query: 227 TVVPKAFEIARRNP 240 V F + + Sbjct: 243 IVDRTIFTLLSKKM 256 >UniRef50_Q8YZS6 Alr0381 protein n=6 Tax=Cyanobacteria RepID=Q8YZS6_ANASP Length = 374 Score = 68.9 bits (167), Expect = 2e-10, Method: Composition-based stats. 
Identities = 46/269 (17%), Positives = 89/269 (33%), Gaps = 45/269 (16%) Query: 18 IFL-QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 ++L + G I ++I K + + + I++ PG +++ + G + Sbjct: 42 LYLTEPGTILRYRNESLIIMKQEKSHNCRLAEITLIVVLPGVQLTDVVISQLLDRGIETI 101 Query: 77 WVGEAG-VRVYASGQPGGARSDKL----------------------LYQAKLALDEDLR- 112 ++ + G R G + +L + ++ L R Sbjct: 102 FLRQDGQFRGRLQGHFATNMTIRLAQYRTVETTFGMALAQKLVIGKVRNQRVLLQRRNRA 161 Query: 113 -------LKVVRKM---FELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWN- 161 L + + + + +L G+EG R Y L + WN Sbjct: 162 TNGQISELTEAIDLISVYASQLNNTTTP-LNRNELMGVEGICARTYYQALKHWFPTQWNF 220 Query: 162 -GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYD 218 GR P D IN +S L +A + AG P +GF H +P + V D Sbjct: 221 NGRNRRPP----LDPINALLSWGYGVLLARVFSACVQAGLDPYLGFFHAIEPYRPNLVLD 276 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREV 247 + + + VV +A ++ + Sbjct: 277 LMEEFRP-VVVDQAVISLIQSDLLTQEDF 304 >UniRef50_B2SPB2 Crispr-associated protein Cas1 n=56 Tax=Bacteria RepID=B2SPB2_XANOP Length = 344 Score = 68.5 bits (166), Expect = 3e-10, Method: Composition-based stats. 
Identities = 44/265 (16%), Positives = 85/265 (32%), Gaps = 48/265 (18%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + V+ + R +PV + ++ VS + ++ G + ++ G R Sbjct: 17 LRKDGANIVMEVERQERARLPVHMLESLVCIGRVAVSPQLLGFCSEHGISICYLTPQG-R 75 Query: 85 VYASGQPGGARSDKLLYQ------------------------------AKLALDEDLRLK 114 A + + + L A+ D L Sbjct: 76 FLARVEGPVSGNVLLRRAQYRRSDDPAGCAAIVRHLLAGKIHNQRAVLARGWRDHGDCLT 135 Query: 115 VVRKMFE-----LRFGEPAPARRSVEQLRGIEGSRVRATYALL-----AKQYGVTWNGRR 164 V R + V+ LRG+EG ++ + + A + + + GR Sbjct: 136 DVAAFQHSLKRLKRIPQRVLVETDVDVLRGLEGEAAQSYFGVFGQLVRADKPLLRFGGRN 195 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADI 222 P D N +S + L +A+ + G PA+GF+H +P S D+A+ Sbjct: 196 RRPPR----DAFNALLSFLYTLLTHDCRSALESVGLDPAVGFLHRDRPGRPSLALDLAEE 251 Query: 223 IKFDTVVPKAFEIARRNPGEPDREV 247 + A + R +R+ Sbjct: 252 FRPLLGERLALSLINRRQLN-ERDF 275 >UniRef50_A1WUP2 CRISPR-associated protein Cas1 n=1 Tax=Halorhodospira halophila SL1 RepID=A1WUP2_HALHL Length = 320 Score = 68.5 bits (166), Expect = 3e-10, Method: Composition-based stats. 
Identities = 46/249 (18%), Positives = 89/249 (35%), Gaps = 43/249 (17%) Query: 18 IFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +++ ++++ A + + +P+ + +++ +S + A+ G L Sbjct: 4 LYIDRRRTRLELAHKALTIREPEAQPRSVPLSLIDRLIVIGQVELSSGVLTTLAESGVSL 63 Query: 76 VWVGEAGVRVYASGQP-GGARSDKLLYQAKLALDEDLRLKVVRKMFELRF---------- 124 V++ G R A + G + + L Q +L E R R++ LR Sbjct: 64 VFMPSRGQRRSAFLRSEGHGDAVRRLGQYRLIHLEAERQAWARRLVRLRLAGQQRLLASA 123 Query: 125 ----GEPAPARRSV------------------EQLRGIEGSRVRATYALLAKQYG--VTW 160 + + EQLRG EG+ A + + + + Sbjct: 124 LYRRPDQRQPLTAAHREIEAAQATVRREAPAGEQLRGQEGTAAAAFFRGYGALFAEALGF 183 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYD 218 +GR P D +N +S + +G A+ AAG PAIG +H S D Sbjct: 184 SGRNRRPPR----DPVNAVLSLGYTLAHGDALRAVTAAGLDPAIGVLHEPAWGRDSLACD 239 Query: 219 IADIIKFDT 227 + +I + Sbjct: 240 LTEIARARV 248 >UniRef50_Q2LQX3 Uncharacterized protein predicted to be involved in DNA repair n=1 Tax=Syntrophus aciditrophicus SB RepID=Q2LQX3_SYNAS Length = 386 Score = 68.1 bits (165), Expect = 4e-10, Method: Composition-based stats. 
Identities = 39/252 (15%), Positives = 76/252 (30%), Gaps = 37/252 (14%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 + Q +I ++ + + + +++ ++H A+ + G +V Sbjct: 4 YVRTQGARIIKEGRHLLVRKGDAVYHTLFTYKLDQLVIFGNVEITHQALAQLMRYGIDVV 63 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE----------LRF-- 124 ++ G + P Q L +E L+ VR + +R Sbjct: 64 FLSFRGRYLGRISPPESKNVFLHKRQYSLLGNETFTLRQVRAIVAGKLANMATLLMRIKR 123 Query: 125 ----------GEPAPA-------RRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRY 165 + SV+ LRG EG + + + + R Sbjct: 124 SRNVSLASQKAHEIQSLIRLLFQAESVDSLRGYEGRGSALYFEAFGRGFIENQGFFRRVR 183 Query: 166 DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADII 223 P D +N +S + L AA+ AG P GF+H S V D+ + Sbjct: 184 RPP----TDPVNSVLSLLYTFLMNRVYAAVRVAGLDPYPGFLHALDYGRYSLVLDLMEEF 239 Query: 224 KFDTVVPKAFEI 235 + + Sbjct: 240 RTIIADTLTLSL 251 >UniRef50_Q6L363 DNA polymerase n=1 Tax=Picrophilus torridus RepID=Q6L363_PICTO Length = 318 Score = 68.1 bits (165), Expect = 4e-10, Method: Composition-based stats. 
Identities = 41/245 (16%), Positives = 91/245 (37%), Gaps = 28/245 (11%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 + + G ++ + + + IP+ ++ I++ ++ A+ + ++ + Sbjct: 3 TYYIISSGTLNREMDSLRFSNSNENKI-IPLENIESIIISGNVSITKPAISILSKKNIPV 61 Query: 76 VWVGEAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVVR--------------- 117 ++ Y S ++ QA A + D R+K+ R Sbjct: 62 FFMSMYD--NYISSLIPEDYLLSGKVIMNQAIKAYNIDERIKIARIFVYAAARNMAIVLK 119 Query: 118 --KMFELRFG-EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGD 174 M +++ + +S+ +L EG+ R + + QY + N + + Sbjct: 120 RGNMGKIKIPYKEIMESKSINELMSYEGN-FRNNFLNIVDQY-LPNNYKIIKRSRRPPRN 177 Query: 175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPKA 232 +N IS LY +TE+ I PAI F+H + S D+++I K A Sbjct: 178 KMNALISYLNMLLYSITESQIFLTHLNPAISFLHEPFERRNSLSLDVSEIFKPLICDRLA 237 Query: 233 FEIAR 237 ++ + Sbjct: 238 IKMVK 242 >UniRef50_B6IX22 CRISPR-associated protein Cas1, putative n=2 Tax=Rhodospirillum RepID=B6IX22_RHOCS Length = 350 Score = 67.7 bits (164), Expect = 4e-10, Method: Composition-based stats. 
Identities = 46/241 (19%), Positives = 75/241 (31%), Gaps = 38/241 (15%) Query: 45 PVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAK 104 P ++L T ++ A+RL + + G P +D L+Q Sbjct: 48 PHRLETLVLLGQ-TALTPNAMRLCMAHRITVALLDGGGNLAARVVPPEARTADLRLHQYT 106 Query: 105 LALDEDLRL------------KVVRKMFELRFGEPAPA-----------------RRSVE 135 L + RL + +R + P + E Sbjct: 107 LHHTPEERLIRARGVVAAKLHNAAEVLRAVRSNQSNPDIARAIAEVERTAATVADAVTPE 166 Query: 136 QLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEA 193 L GIEG+ R +A L + + + GR P D N +S L Sbjct: 167 TLLGIEGNGARQYFAGLRAAFVGDIRFTGRAQRPP----PDPANSMLSFGYVLLGNRIAG 222 Query: 194 AILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLAC 251 + A G P +GF H +P S D+ + ++ V + +PD A Sbjct: 223 LLEARGIDPCLGFFHALRPGRPSLALDLLEELRQPVVDRLVLRLCNLRMLKPDMFEADAE 282 Query: 252 R 252 R Sbjct: 283 R 283 >UniRef50_D0KYZ2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria RepID=D0KYZ2_HALNC Length = 344 Score = 67.7 bits (164), Expect = 5e-10, Method: Composition-based stats. 
Identities = 48/261 (18%), Positives = 88/261 (33%), Gaps = 49/261 (18%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + + + + + + +P+ + ++ +S A + A G LV + +G Sbjct: 17 VHLENATVRIDVEREKKLQVPLHHLNGLVCFGNIMISPALMHRLADDGKSLVLMDSSG-- 74 Query: 85 VYASGQPGGARSDKLLYQA--KLALDEDLRLKVVRKMFELRFGE---------------- 126 + + G + LL QA + A D L++ R + + Sbjct: 75 RFKARLEGPVSGNILLRQAHHRQASDAAFALEIARTIVSGKLKNSRSVVQRGARETSDTI 134 Query: 127 -----------------PAPARRSVEQLRGIEGSRVRATYALL------AKQYGVTWNGR 163 A A S+++LRGIEG R ++ + A + NGR Sbjct: 135 ETTQLTRSADNLAASLRAAAAATSMDELRGIEGEAARGYFSAINLIVKTAMRANFQLNGR 194 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIAD 221 P D N IS + L +AI A G +GF+H +P + D+ + Sbjct: 195 TRRPP----LDRFNALISFLYAMLMNDCRSAIEATGLDAQLGFLHAVRPGRAALALDLME 250 Query: 222 IIKFDTVVPKAFEIARRNPGE 242 + A + R Sbjct: 251 EFRAIAADRLALTLINRGQIN 271 >UniRef50_Q7MRD4 Putative uncharacterized protein n=1 Tax=Wolinella succinogenes RepID=Q7MRD4_WOLSU Length = 314 Score = 67.7 bits (164), Expect = 5e-10, Method: Composition-based stats. 
Identities = 67/275 (24%), Positives = 106/275 (38%), Gaps = 23/275 (8%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLE-PGTRVSHAAVRLAAQVGTLLVWVGEA 81 + V D V+ + G RT +P+ + +++E P +S A + AQ GT L+ + Sbjct: 13 CHLRVRDRQLVIESREGERTQLPLADIGVVIVENPQVTLSAALLSALAQEGTALMSCDSS 72 Query: 82 GVR--VYASGQPGGARSDKLLYQAKLALDEDL--------RLKVVRKMFELRFGEPAPAR 131 + V+A + QA + + R KV + LR A Sbjct: 73 HLPDGVFAPFGTHSRHTRAARVQA-VWSEPFKKRCWQKIVRAKVSSQAEVLRRVGAEDAA 131 Query: 132 RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKG-DTINQCISAATSCLYGV 190 R +E L G S T + A+ V W + K W+ G D N ++ + L G Sbjct: 132 RRLENLVGKVTSG--DTTGVEAQAAQVYWRSLFVNFKRWDGGLDARNGGLNYGYAVLRGA 189 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKF--DTVVPKAFEIARRNPGEPDRE 246 AI AG PA G H G+ SF D+ + + D +V F + R RE Sbjct: 190 LGRAIAGAGLIPAFGLHHAGELNSFNLADDLIEPFRPFVDLLVWGLF-LERTGDDPLTRE 248 Query: 247 VRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPA 281 R+ I S + E +L+A +I + Sbjct: 249 ERMRIAQILGESCIMT---ERYETLLSATQIIARS 280 >UniRef50_C0BZ41 Putative uncharacterized protein n=3 Tax=Clostridiales RepID=C0BZ41_9CLOT Length = 343 Score = 67.3 bits (163), Expect = 6e-10, Method: Composition-based stats. 
Identities = 44/271 (16%), Positives = 95/271 (35%), Gaps = 48/271 (17%) Query: 11 LKDRVSMIFLQYGQ--IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K ++ +++ + + V+++ +P+ ++ I+ T S A + Sbjct: 1 MKKLLNTLYVTSTNRYLFLDGENVVILEDQEEIGRVPLHNLEGIVTFGYTGASPALMGAC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGA-RSDKLLYQAKLALDEDLRLKVVRKM-------- 119 A+ L ++ G R A Q +++ + +K+ R Sbjct: 61 AERNIDLSFMSGNG-RFLARVSGEVRGNVTLRKEQYRISEQKKESIKIARNFITGKVYNA 119 Query: 120 ----------FELRFGEPAPARRSV---------------EQLRGIEGSRVRATYAL--- 151 + LR +S E+L GIEG +++ Sbjct: 120 KWVLERAARDYPLRLDVDRIKEKSAFMSGNLPKIRECEDAERLLGIEGESASLYFSVFDE 179 Query: 152 --LAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 L ++ ++ GR P D +N +S A + L G+ +A+ + G P +GF HT Sbjct: 180 LILQQKDEFSFGGRNKRPP----LDNVNAMLSFAYTLLTGMCASALESVGLDPYVGFYHT 235 Query: 210 GKPL--SFVYDIADIIKFDTVVPKAFEIARR 238 +P S D+ + ++ + + Sbjct: 236 DRPGRVSLALDLMEELRSVMADRFVLTLINK 266 >UniRef50_C0FSR1 Putative uncharacterized protein n=1 Tax=Roseburia inulinivorans DSM 16841 RepID=C0FSR1_9FIRM Length = 359 Score = 67.3 bits (163), Expect = 6e-10, Method: Composition-based stats. 
Identities = 52/295 (17%), Positives = 97/295 (32%), Gaps = 44/295 (14%) Query: 15 VSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 + + G + D + + G + IPV + + + +S QVG Sbjct: 32 HNYHLINEGILTKQDFNILFESENGKKY-IPVETTDSLYIYSNVIMSGNFFDFMNQVGLN 90 Query: 75 LVWVGEAGVRVYASGQPGGARS-DKLLYQAKLALDEDLRLKVVRKM-------------- 119 + ++ + G ++ + R+ L Q ++ E RL + R++ Sbjct: 91 VSFINKYGEKIGSFVPNNSRRNIKTELKQLRMYDSEKERLDMARRLEIASVSNIRANLRY 150 Query: 120 FELR---------------FGEPAPARRSVEQLRGIEGSRVRATYA-----LLAKQYGVT 159 ++ R R + + +E + Y L KQ+ Sbjct: 151 YQRRKNATELGAAVKDMTDIITKLNEARDINHMMMLEAQARQKYYGCFNSILEGKQFY-- 208 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVY 217 ++ R P D +N IS + LY I IG VH +P S Sbjct: 209 FDKRTRRPPQ----DPLNAMISFGNTLLYQRIANEINRTSLDIRIGIVHAAGNRPESLNL 264 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 D+AD+ K V F + R + V + I+ +++ I E+ L Sbjct: 265 DLADLFKPILVDRTIFTLVNRKMINVNDFVEVENNGIYLNNRAKKIFISEYENKL 319 >UniRef50_C9LGP6 CRISPR-associated protein Cas1 n=3 Tax=Prevotella RepID=C9LGP6_9BACT Length = 295 Score = 67.3 bits (163), Expect = 7e-10, Method: Composition-based stats. 
Identities = 40/267 (14%), Positives = 99/267 (37%), Gaps = 26/267 (9%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKT--GIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVG 72 S++F+ + + +G V+I K +P+ + +M+ ++ + + Sbjct: 5 SLVFMHPATLSLRNGQMVIIRKEIPDDNLIVPIEDIGLVMINHAMVSLTIPLLNALTEQN 64 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRK-MFELRFGEPAPAR 131 +++ E G+ + + + Q + ++ +V++K +++ + Sbjct: 65 VAVIFCNEKGMP---ASMLYNLQGNTT--QGETLHNQLEAGEVLKKTLWKQIIEAKIKNQ 119 Query: 132 RSVEQLRGIEGSRVRATYALL------------AKQYGVTWNGRRYDPKDWEKGDTINQC 179 ++ G EGS ++ Y + A+ Y GR + G IN Sbjct: 120 AALLNKMGKEGSILKPLYTNVKSGDSDNREGIAARLYWTALFGRDFIRDRNIPG--INSL 177 Query: 180 ISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIAR 237 ++ S L A++++G PA+G H + +F D+ + + V FE+ Sbjct: 178 LNYGYSVLRAAVTRALVSSGLFPALGIFHHHRSNAFPLSDDLMEPFRP-FVDEIVFELTA 236 Query: 238 RNPGEPDREVRLACRDIFRSSKTLAKL 264 + E + + + +K+ Sbjct: 237 QGEAELNTATKSRLIRVLYVDTYFSKI 263 >UniRef50_C7RP03 CRISPR-associated protein Cas1 n=1 Tax=Candidatus Accumulibacter phosphatis clade IIA str. UW-1 RepID=C7RP03_9PROT Length = 339 Score = 67.3 bits (163), Expect = 7e-10, Method: Composition-based stats. 
Identities = 58/329 (17%), Positives = 117/329 (35%), Gaps = 62/329 (18%) Query: 18 IFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +++ + A V + +P+ ++ + + +S A + + G + Sbjct: 4 LYVDRRGVTLKADGEALVFYENGERIGTVPLAPLSRVFMRGDVTLSSALLGKLGERGIGV 63 Query: 76 V-WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG--------- 125 V G V G+P + + + Q +L+LD D L+ R + E + Sbjct: 64 VVLSGRKAVPTMLLGRP-HNDAARRVAQYRLSLDGDFCLRFARAIVEAKLRAQATFLTER 122 Query: 126 -EPAPARR---------------------SVEQLRGIEGSRVRATYALLAKQY--GVTWN 161 + P R S+ LRG+EG+ A + A + ++ Sbjct: 123 RDSEPHSRYLLTLSLRRLATSIAAVDEQGSLGSLRGLEGAGAAAYFEGFADLLPERLKFS 182 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYD 218 GR P D +N +S + L+ A+ AG P +GF H G+ S D Sbjct: 183 GRNRRPPR----DPVNAMLSLGYTLLHAEAVLALYGAGLDPFVGFYHALDFGRE-SLACD 237 Query: 219 IADIIKFDT-----VVPKAFEIARRNPGEPDR--EVRLACRDIF---------RSSKTLA 262 + + ++ + ++ ++ ++ + + A R F R K LA Sbjct: 238 LVEPLRVEVDRHALMLFRSEKLRAEGFSTTESGCLLGKAGRARFYGEWEEVAARLRKLLA 297 Query: 263 KLIPLIED-VLAAGEIQPPAPPEDAQPVA 290 + + + ++ A ++ P P P A Sbjct: 298 ESVSDVASAIMQAAAVEAPCSPPIGDPAA 326 >UniRef50_Q467D6 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Methanosarcina barkeri str. Fusaro RepID=Q467D6_METBF Length = 550 Score = 67.0 bits (162), Expect = 8e-10, Method: Composition-based stats. 
Identities = 36/239 (15%), Positives = 77/239 (32%), Gaps = 29/239 (12%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + V+ +P+ ++ + + +S +R ++ + + G Sbjct: 234 VHKKGDRLVIKKNDEELQSVPLRQISQLSIYGDAHISLPVLRSLIEMNVPVCYFSFGGWF 293 Query: 85 VYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE---------LRFGEPAPARRSVE 135 S D ++Q + A D + L + RKM LR + + + + Sbjct: 294 YGLSHGVMSKNVDLRIHQYQTAFDSERSLAISRKMIAGKIKNCRTLLRRNDTEVSEKILS 353 Query: 136 QL----------------RGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQC 179 QL GIEG+ + ++ + + + D +N Sbjct: 354 QLNSLEKKASNAKEIGQLLGIEGTAAQIYFSRFGNMLKQDLDCKFENRNKRPPTDPVNAV 413 Query: 180 ISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIKFDTVVPKAFEI 235 +S L + + G+ P +GF H GKP + D+ + + A + Sbjct: 414 LSYLYGILTKEVFVTLFSVGFDPYMGFYHQPKYGKP-ALALDLMEEFRPLIADSVALTL 471 >UniRef50_A3CTI4 CRISPR-associated protein Cas1 n=1 Tax=Methanoculleus marisnigri JR1 RepID=A3CTI4_METMJ Length = 392 Score = 67.0 bits (162), Expect = 8e-10, Method: Composition-based stats. Identities = 50/282 (17%), Positives = 92/282 (32%), Gaps = 32/282 (11%) Query: 26 DVIDGAFVLIDKTGIRTH-IPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + LI G T P+ +V +++ G + +AV + G + G Sbjct: 97 HIKATTRELIIARGSDTRRYPIQAVKHLLIVGGHTLHTSAVTNLLKAGAAITIFDIDGTP 156 Query: 85 ---VYASGQPGGARSDKLL--------YQAKLALDEDLRLKVVRKMF-----------EL 122 +Y G Q RL ++ +++ EL Sbjct: 157 VGYLYPYGYRPDESVRLAQERAGPHRFAQPLARASLQSRLLLLEELYDHAGHDIFYAGEL 216 Query: 123 RF----GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQ 178 F E A ++E LR + Y +L++ RR + + D +N Sbjct: 217 DFLHQAREELSASVTMENLRRLSRLTTDMYYEILSRTLPPELGFRRRTSRPYL--DPVNA 274 Query: 179 CISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARR 238 + + +YG +++ A P +G +H G SFV+D+ + K V AR Sbjct: 275 MFALGYAMIYGNCCVSVVGAHLDPDLGMLHEG-AGSFVHDLIEPQKASMVDRAVIRFARE 333 Query: 239 NPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 D E + + ++L + D + I Sbjct: 334 EISSGDYEC--GEKRCYLGGDLSSRLAAALRDSIDQARIDAQ 373 >UniRef50_C3MWK6 CRISPR-associated protein Cas1 n=6 
Tax=Sulfolobus RepID=C3MWK6_SULIM Length = 341 Score = 67.0 bits (162), Expect = 8e-10, Method: Composition-based stats. Identities = 42/272 (15%), Positives = 94/272 (34%), Gaps = 45/272 (16%) Query: 4 LPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 + +KD + + ++ G I + D+ ++ I + I++ + +S Sbjct: 42 MDKKIAFVKDYGAYLKIEKGLI-----TCKIKDQ--VKWSIAPTELHSIIVLTNSSISSE 94 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPG--GARSDKLLYQAKLALDEDLRLKVVR---- 117 V++A + G +V+ YA P L Q + ++ + Sbjct: 95 VVKVANEYGIEIVFFNNN--EPYAKLIPAKYAGSFKVWLKQLTAW--KRRKVDFAKAFIY 150 Query: 118 ----------KMFELRFGEPAPARR------------SVEQLRGIEGSRVRATYALLAKQ 155 + +E ++G ++ + E++ E + + + Sbjct: 151 GKVHNQWVTLRYYERKYGYDLKSQELDRLAREVMFVNTAEEVMQKEAEAAKVYWRGVKSL 210 Query: 156 Y--GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGK 211 + + GRR D D N+ ++ L V A+++ G P IGF+H Sbjct: 211 LPKSLGFKGRRKRVSD--NLDPFNRALNIGYGMLRKVVWGAVISVGLNPYIGFLHKFRSG 268 Query: 212 PLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 +S V+D+ + + V K R + + Sbjct: 269 RISLVFDLMEEFRSPFVDRKLIGFVRESADKI 300 >UniRef50_B4W4R1 CRISPR-associated protein Cas1 n=1 Tax=Microcoleus chthonoplastes PCC 7420 RepID=B4W4R1_9CYAN Length = 354 Score = 67.0 bits (162), Expect = 9e-10, Method: Composition-based stats. 
Identities = 43/270 (15%), Positives = 97/270 (35%), Gaps = 41/270 (15%) Query: 11 LKDRVSMIFLQYGQ-IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAA 69 ++ ++ ++ G I F++ + +P+ V I++ ++S +++ Sbjct: 18 HREMAAIYLIEQGTTIYKEYQRFIIYVSEKPKLEVPIREVQQILVFGNIQLSTPVMQVCL 77 Query: 70 QVGTLLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKM-FE------ 121 + +V++ ++G R + D+ L Q + D + +V + + + Sbjct: 78 REQIAVVFLSQSG-RYHGHLWSSEFRDLDQELVQVRRWGDAAFQFQVSQAIVYGKLMNSK 136 Query: 122 ---LRFG-------------------EPAPARRSVEQLRGIEGSRVRATYALLAK---QY 156 LRF E S+++LRG EG + L + Sbjct: 137 QLLLRFNRKRKLPDVERAIIGINQDIEALEFSESLDRLRGYEGIGAARYFPALGQLITNS 196 Query: 157 GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS-- 214 ++ R P D +N +S + L+ I+A G +P +G H G+ Sbjct: 197 RFEFSLRNRQPP----TDPVNSLLSFGYTLLFNNVLGFIIAEGLSPYLGNFHYGERQKPY 252 Query: 215 FVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 +D+ + ++ V I + +P Sbjct: 253 LAFDLMEEMRSVVVDSLVLNIVNHSLFKPQ 282 >UniRef50_C7G6C1 CRISPR-associated protein Cas1 n=3 Tax=Firmicutes RepID=C7G6C1_9FIRM Length = 334 Score = 66.6 bits (161), Expect = 1e-09, Method: Composition-based stats. 
Identities = 45/264 (17%), Positives = 91/264 (34%), Gaps = 46/264 (17%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S +++ Q I + F + K G+ IP ++ I + +++ + + G Sbjct: 1 MSYLYVSEQGASIGIEANRFQVNYKDGMIKSIPAETLEMIEVFGSVQITTRCLTECLKRG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQ---AKLALDEDLRLK--------------V 115 +++ +G Y G+ + Q A++ +E +L+ V Sbjct: 61 VNILFYSTSGA-YY--GRLISTSHVNVQRQRIQAEIGHNETFKLEMSKRIIDAKIRNQVV 117 Query: 116 VRKMFELRFGEPAPAR---------------RSVEQLRGIEGSRVRATYALLAKQYGVTW 160 V + + R + R +SVEQ+ G EG+ + + +L K + Sbjct: 118 VLRRYA-RGRDEDIHRMIIEMQNMQKKLLYAKSVEQVMGYEGTAAKIYFKVLGKLIDEQF 176 Query: 161 --NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFV 216 GR P D N IS S + I G P G +H + + Sbjct: 177 VFEGRSRRPP----MDPFNSLISLGYSIILNELYGKIEGKGLNPYFGVMHKDREKHPTLA 232 Query: 217 YDIADIIKFDTVVPKAFEIARRNP 240 D+ + + + A + + Sbjct: 233 SDLMEEWRAVLIDTTALSMLNGHE 256 >UniRef50_B9M9X7 Putative uncharacterized protein n=1 Tax=Diaphorobacter sp. TPSY RepID=B9M9X7_DIAST Length = 302 Score = 66.6 bits (161), Expect = 1e-09, Method: Composition-based stats. 
Identities = 39/249 (15%), Positives = 81/249 (32%), Gaps = 23/249 (9%) Query: 31 AFVLIDKTGIRTHIPVGSVACIMLE-PGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASG 89 F L+ + +P +A I+L ++H + + G L G+ Sbjct: 18 HFALVVEQEQSARVPFEDIAVIVLNHREITLTHPVLSACGEYGIGLYSTGDNHQPNGVFM 77 Query: 90 QPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGS------ 143 + + +L LD+ + + +++ G R +E L G G+ Sbjct: 78 PFLQHSRATRMQRLQLDLDKPSAKRAWAHIVQVKIGNQ---ARCME-LLGTVGTDRLASY 133 Query: 144 --RVRA-----TYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAIL 196 RVR+ A + Y GR + N + + + G A++ Sbjct: 134 ARRVRSGDGGNLEAQASAYYFPQVFGRSFHRSQTGWS---NAALDYGYAVMRGACARALV 190 Query: 197 AAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDI 254 A G P++G H + +F D+ + + + A E ++A + Sbjct: 191 AHGMLPSLGLFHRSEQNAFNLADDLIEPYRPVVDLHVAQHRPADEDAELQPSDKVALVGL 250 Query: 255 FRSSKTLAK 263 + + Sbjct: 251 LNVDVAMPR 259 >UniRef50_A1HM55 CRISPR-associated protein Cas1 n=2 Tax=Thermosinus carboxydivorans Nor1 RepID=A1HM55_9FIRM Length = 332 Score = 66.2 bits (160), Expect = 1e-09, Method: Composition-based stats. 
Identities = 51/264 (19%), Positives = 86/264 (32%), Gaps = 45/264 (17%) Query: 18 IFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +++ I G F++ I IP+ + ++L +VS + + G L Sbjct: 4 LYVTDAGSHIQKNAGRFLVCKGDTILREIPLELLDNVVLFGSIQVSAKTITEFLKRGITL 63 Query: 76 VWVGEAGVRVYASGQPGGARSD---KLLYQAKLALDEDLRLKVVRK---------MFELR 123 W+ + G Y G+ R Q ++ D LK+ + M LR Sbjct: 64 TWLSKTG-EFY--GRLESTRHIDIFLHRQQIRMGDRPDFCLKIAQAIIDAKIANCMTILR 120 Query: 124 --------------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWN 161 E P +E L G+EGS R + LA + Sbjct: 121 RYQRTANSPEVADHIHAMGIIAEKIPNVDKIETLLGLEGSAARHYFTALACLVPDDFAFK 180 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDI 219 GR P D N +S + L + AG P GF+H + + V D+ Sbjct: 181 GRNKQPPK----DPFNSLLSFGYTLLMYDFYTIVQNAGLHPYAGFLHKDRQGHPTLVSDL 236 Query: 220 ADIIKFDTVVPKAFEIARRNPGEP 243 + + + + R +P Sbjct: 237 MEEWRPSIIDSLVMSLIHRREIQP 260 >UniRef50_Q0AW57 CRISPR-associated protein, Cas1 family n=1 Tax=Syntrophomonas wolfei subsp. wolfei str. Goettingen RepID=Q0AW57_SYNWW Length = 336 Score = 66.2 bits (160), Expect = 1e-09, Method: Composition-based stats. 
Identities = 46/265 (17%), Positives = 93/265 (35%), Gaps = 48/265 (18%) Query: 15 VSMIFL-QY-GQIDVIDGAFVLIDKTGIRTHI-PVGSVACIMLEPGTRVSHAAVRLAAQV 71 +S +++ + +I V + V+ K I P+ V +++ +S V+ + Sbjct: 1 MSFLYVYERSAKIGVQENCVVVESKKENLKRILPIEGVENVIIFGDASLSSNCVKQFMER 60 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLAL---DEDLRLKVVRKM--------- 119 L W+ G + Y G+ R+ + Q K D++ L + +++ Sbjct: 61 DINLTWLSSRG-KFY--GRLESTRNVNIYRQRKQFACGEDDEFCLALAKRIILAKVKNQI 117 Query: 120 -----FELRFGEPAPARR---------------SVEQLRGIEGSRVRATYALLAK--QYG 157 + E + + + ++L G EG R Y LA+ + Sbjct: 118 TILRRYRRNRPEKSVQKIIDAMAKLLPIMERVHNKDELMGHEGMAARYYYQGLAELVEPD 177 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLS 214 ++GR P D N +S A + L A + G P F+H+ G P + Sbjct: 178 FAFSGRNRQPPR----DPFNSLLSFAYTLLMYDLYTAAVNRGLNPYASFLHSIRRGHP-A 232 Query: 215 FVYDIADIIKFDTVVPKAFEIARRN 239 D+ + + A + + Sbjct: 233 LCSDLMEEWRAILADSLALYVTSKG 257 >UniRef50_B9YDC3 Putative uncharacterized protein n=1 Tax=Holdemania filiformis DSM 12042 RepID=B9YDC3_9FIRM Length = 343 Score = 66.2 bits (160), Expect = 1e-09, Method: Composition-based stats. 
Identities = 39/260 (15%), Positives = 90/260 (34%), Gaps = 48/260 (18%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + + V++++ + IP+ +++ I++ +S A + L ++ AG Sbjct: 17 LSLDGENVVILNQQKVIKRIPLHNLSSIVMFNYQGISPALMGKCMSQNITLSFLSPAG-- 74 Query: 85 VYASGQPGGARSDKLLY--QAKLALDEDLRLKVVRKM------------------FELRF 124 + G + + LL Q ++ + L + + M + LR Sbjct: 75 YFLGRVVGEYQGNVLLRKKQILVSENNQQSLLIAKNMILAKVYNSKWILERAIRDYPLRI 134 Query: 125 GEP---------------APARRSVEQLRGIEGSRVRATYAL-----LAKQYGVTWNGRR 164 +S ++LRG+EG+ + ++ L ++ + R Sbjct: 135 DIEKMRVIFKKLSEILNSIEKVQSHDELRGLEGTAAKLYFSSFDDLILRQKEDFVFTTRT 194 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKP--LSFVYDIADI 222 P + +N +S + L +A+ + G IGF H +P +S D+ + Sbjct: 195 RRPP----LNKVNALLSFVYTLLSHDCASALESVGLDCYIGFFHVDRPGRMSLALDLMEE 250 Query: 223 IKFDTVVPKAFEIARRNPGE 242 + + R + Sbjct: 251 FRPCLADRFVLSLINRKEID 270 >UniRef50_C3WS02 CRISPR-associated protein n=2 Tax=Fusobacterium RepID=C3WS02_9FUSO Length = 335 Score = 65.8 bits (159), Expect = 2e-09, Method: Composition-based stats. 
Identities = 40/291 (13%), Positives = 99/291 (34%), Gaps = 40/291 (13%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 I+ Q + + ++ G IP+ ++ +++ G ++S A + G + Sbjct: 5 YIYEQGIVLRYKENRLLITYTNGDYKSIPIENIDNVVIFGGIQLSTACMHNLLIKGIHVT 64 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAK------------------------------LA 106 ++ + G G+ + + Q + + Sbjct: 65 FLSKTGSYF---GRLESTSNINIDRQREQFRKSDDKKFCLAIGKKFIKGKATNQRTLLIR 121 Query: 107 LDEDLRLKVVRKMFELRFG--EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRR 164 ++DL+ +++ + FG + +++E+L G+EG R + + ++ + Sbjct: 122 ANKDLKSEILSSVINSMFGIIKDINDSKTIEELMGVEGYLARLYFNAINHIIDKKYSFKT 181 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADI 222 + + D N IS + L+ ++ G P F+H+ + + D+ + Sbjct: 182 RTKRPPK--DPFNAVISFGYTLLHYEIFTTLVTKGLNPYAAFLHSDRHKHPALCSDLMEE 239 Query: 223 IKFDTVVPKAFEIARRNPGEPDR-EVRLACRDIFRSSKTLAKLIPLIEDVL 272 + V A + N + +F + K K + E L Sbjct: 240 WRAILVDSMAIALLNNNKIAYEDFNFDEKSGGVFLNKKACGKFVEQFEKRL 290 >UniRef50_B7A8Y4 CRISPR-associated protein Cas1 n=1 Tax=Thermus aquaticus Y51MC23 RepID=B7A8Y4_THEAQ Length = 316 Score = 65.8 bits (159), Expect = 2e-09, Method: Composition-based stats. Identities = 35/152 (23%), Positives = 54/152 (35%), Gaps = 8/152 (5%) Query: 134 VEQLRGIEGSRVRATYALLAKQYGVT-WNGRRYDPKDWEKGDTINQCISAATSCLYGVTE 192 +E LRG EG RA + L + + GR P D +N +S + L G Sbjct: 137 LESLRGAEGEGSRAYFQGLGRLLAAQGFGGRTRRPPR----DPVNAALSYGYALLLGRVL 192 Query: 193 AAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 A+ AG P +GF+H S D+ + + V RR P Sbjct: 193 VAVRLAGLHPEVGFLHAEGRRSPALALDLMEEFRVPVVDAVVLSAFRRGHLTP-AHAEAR 251 Query: 251 CRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAP 282 ++ + + +LI L+E P Sbjct: 252 EGGVYLNEEGRRRLIELLEGRFLEEVAHPLGF 283 >UniRef50_Q2NH78 Putative uncharacterized protein n=1 Tax=Methanosphaera stadtmanae DSM 3091 RepID=Q2NH78_METST Length = 332 Score = 65.8 bits (159), Expect = 2e-09, Method: Composition-based stats. 
Identities = 47/288 (16%), Positives = 97/288 (33%), Gaps = 39/288 (13%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 G + + +K + T IP+ ++ I + A+ L + ++ + + G Sbjct: 19 GILYRKENTLKFKNKE-VDTSIPIHAINEINCYGKVSLRSGAISLLQKEKIIINFFNKYG 77 Query: 83 VRVYAS---GQPGGARSDKLLYQAKLALDEDLRLKVVRKM--------------FELRFG 125 Y + ++ QA +E+ R + ++M ++ + Sbjct: 78 --YYEGSLYPKIALNSGVIVVKQALTYNNENKRCFIAKEMVNGMKHNMIKTLKYYKKKGK 135 Query: 126 EPAPARRSVE----------QLRGIEGSRVRATYALLAKQYG-VTWNGRRYDPKDWEKGD 174 + +++E ++ EG ++ Y R + P E Sbjct: 136 DVDEHIQNLENESILDGNINRILSSEGKLWQSYYPSFDNITKKFPIEKREFKPPKNE--- 192 Query: 175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKA 232 +N IS S LY T + I P+I F+H + S D+ADI K + Sbjct: 193 -MNSLISYGNSLLYTTTLSEIYHTYLHPSISFLHEPRERRFSLACDLADIFKPLIISRTI 251 Query: 233 FEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 F++ N ++ + ++ + K K I + L P Sbjct: 252 FKLVNTNIIN-EKHFKKDV-GVYLNEKGRQKFIQEYNNKLKTTIKHPQ 297 >UniRef50_Q2RY11 CRISPR-associated protein, Cas1 family / CRISPR-associated exonuclease, Cas4 family n=2 Tax=Rhodospirillum rubrum ATCC 11170 RepID=Q2RY11_RHORT Length = 580 Score = 65.4 bits (158), Expect = 2e-09, Method: Composition-based stats. 
Identities = 40/241 (16%), Positives = 75/241 (31%), Gaps = 42/241 (17%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 +I D V+ + + + ++ ++L ++ A+ + + W+ Sbjct: 255 ARIGKKDYTLVIQVEGEADRSLALDEISEVVLAGPVSLTTPAIHELLRREIPVAWMSSGF 314 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE--------------------- 121 + ++G G + Q LA DE R R + Sbjct: 315 WFLGSTGGQGPRSAAVRTAQYALAGDERRRQAFARDLVSAKIRNGRTLLRRNWRGAEAER 374 Query: 122 -------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--------GVTWNGRRYD 166 R E A + L GIEG + + + + R Sbjct: 375 QIALDRLARLAERATTAETTACLLGIEGEAAAVYFRAFPQLFTQAVTTLPAFAFERRNRR 434 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIK 224 P D +N C+S + L +A+ AG P GF HT +P + D+ + + Sbjct: 435 PP----ADPVNACLSLCYAVLTRTLSSALSIAGLDPWKGFYHTERPGRPALALDLIESFR 490 Query: 225 F 225 Sbjct: 491 P 491 >UniRef50_A8ABE8 CRISPR-associated protein Cas1 n=1 Tax=Ignicoccus hospitalis KIN4/I RepID=A8ABE8_IGNH4 Length = 310 Score = 65.4 bits (158), Expect = 2e-09, Method: Composition-based stats. Identities = 24/110 (21%), Positives = 42/110 (38%), Gaps = 11/110 (10%) Query: 145 VRATYALLAKQYGVTWNG-RRYDPKD----WEKGDTINQCISAATSCLYGVTEAAILAAG 199 V+ A + + W ++ P D N+ + + LY V A++ AG Sbjct: 153 VKKLEAEWSSKL---WKDIVQFVPGMRSRVPRGNDPPNRTLDYLYALLYSVCNHALVGAG 209 Query: 200 YAPAIGFVHTGK--PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREV 247 P G +H + LSFVYD +++ K + R E + + Sbjct: 210 LDPYAGLIHRERAGKLSFVYDFSEMFKP-MAIYVMATAIRTYKIELEGDF 258 >UniRef50_C8WTR3 CRISPR-associated protein Cas1 n=2 Tax=Alicyclobacillus acidocaldarius RepID=C8WTR3_ALIAD Length = 346 Score = 65.4 bits (158), Expect = 3e-09, Method: Composition-based stats. 
Identities = 48/270 (17%), Positives = 92/270 (34%), Gaps = 49/270 (18%) Query: 18 IFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVGTL 74 +F+Q + V V+ + +P+ V I+ G + + A G Sbjct: 9 LFVQREGAIVRVHQDTVVVTLENETLLRVPMHMVDSIV-GIGRVSFTSPLLERCAAEGRS 67 Query: 75 LVWVGEAGVRVY-----------------ASGQPGGARSDKLLYQA--KLALDEDLRLKV 115 +V + G +Y + + + K+ L LK Sbjct: 68 VVRMTRGGRFLYRIEGRMSGNVLLRTAQHEAARSPERSLTIMRAIVAGKVHNQRQLVLKA 127 Query: 116 VRKMFE---LRFGEP-----------APARRSVEQLRGIEGSRVRATYALL------AKQ 155 R + F P+ +++RG+EG+ R + L A + Sbjct: 128 ARDLTAPADRSFVREVAGDLGRELRKLPSASHPDEIRGVEGASARRYFMALRHLIAPAIR 187 Query: 156 YGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL-- 213 ++++GR P D +N +S + + E+A+L G P IGF+HT +P Sbjct: 188 DALSFDGRNRRPPR----DPVNAVLSFLYALITRDAESALLGVGLDPQIGFLHTLRPGRP 243 Query: 214 SFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 S D+ + ++ + R +P Sbjct: 244 SLALDLVEEMRPILADRVMLSLFNRRQLQP 273 >UniRef50_Q74N45 NEQ017 n=1 Tax=Nanoarchaeum equitans RepID=Q74N45_NANEQ Length = 333 Score = 65.0 bits (157), Expect = 3e-09, Method: Composition-based stats. 
Identities = 50/264 (18%), Positives = 85/264 (32%), Gaps = 47/264 (17%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 ++ L G++ I+K ++ IP+ S+ I + V++ A++L A + Sbjct: 3 TIYILSIGKLYRGKNGLYFINKDKKKSPIPLESIKEIFILNKVSVTYNALKLLADRNIPI 62 Query: 76 VWVGE---AGVRVYASGQPG---GARSDKLLYQAKLALDEDLRLKVV------------- 116 + E G+ Y L+ Q + D + R ++ Sbjct: 63 HFFYENTKKGISYYLGSFLPRQKTKSGLVLVKQVEAYKDIEKRTEIALEIVDAIRYNCIK 122 Query: 117 ------------------RKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGV 158 KMFE + A + +RGIE + Y L K + Sbjct: 123 VLEKYHIDEVKELRKIDVWKMFEESLNDWKDA---INIIRGIESNIWNLFYQGLDKILKL 179 Query: 159 -TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSF 215 R P E N +S A + LYGVT I P I F+H S Sbjct: 180 YKLERRTRRPPKNEA----NTIVSFANTLLYGVTLTEIYKTHLDPTISFLHELRDTRYSL 235 Query: 216 VYDIADIIKFDTVVPKAFEIARRN 239 D+++ K + + Sbjct: 236 ALDLSENFKPIITFRILIWLVNQG 259 >UniRef50_B3PMY9 Putative uncharacterized protein n=1 Tax=Mycoplasma arthritidis 158L3-1 RepID=B3PMY9_MYCA5 Length = 297 Score = 64.6 bits (156), Expect = 4e-09, Method: Composition-based stats. 
Identities = 51/280 (18%), Positives = 96/280 (34%), Gaps = 28/280 (10%) Query: 33 VLIDKTGIRTHIPVGSVACIMLEPGT-RVSHAAVRLAAQVGTLLVWVGEAGVR-VYASGQ 90 ++++K G + IP + ++ E ++ + G ++ G + Sbjct: 21 LVVNKDGTKITIPTSQIDTVLFENDKLTITLPLINDLVDHGINIIVCGRNHMPKAQIIPF 80 Query: 91 PGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRG-----IEGSR- 144 G + L Q K D D + +V R++ +L+ + + LR +E Sbjct: 81 QGYYNAKILQKQIKW--DNDFKERVWRRIIKLKIRQTMVMLEHLAILRDDDKVKLEDYAN 138 Query: 145 ------VRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAA 198 + AK G ++ K DT N ++ S L AI + Sbjct: 139 SVKAFDITNCEGHCAKLNFKILFGEKFVRKTNTMEDTYNAYLNYGYSVLLSYVARAICSK 198 Query: 199 GYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPD-REVRLACRDIF 255 GY +G H D+ + + V ++ R + GEP+ +E + IF Sbjct: 199 GYDNRLGIFHRSYSNFYPLACDLMEPFRC-IVESLVYKHVRISRGEPNFQEFKEELFTIF 257 Query: 256 RSS-------KTLAKLIP-LIEDVLAAGEIQPPAPPEDAQ 287 L I L+ VL+ +I+ P A+ Sbjct: 258 YKHFNCQGENLMLIDCIDKLVVLVLSDHDIKGPYFNWSAE 297 >UniRef50_A0LWB2 CRISPR-associated protein Cas1 n=1 Tax=Acidothermus cellulolyticus 11B RepID=A0LWB2_ACIC1 Length = 295 Score = 64.6 bits (156), Expect = 4e-09, Method: Composition-based stats. 
Identities = 48/255 (18%), Positives = 86/255 (33%), Gaps = 33/255 (12%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 G++ GA ++ D+ +P+ VA ++ P + + + AA G +V G Sbjct: 15 GEVHAAQGALLVGDER-----VPLVDVAMMLTGPYVSLHGSVIDRAAAFGVGVVHCDWRG 69 Query: 83 VRVYA--SGQPGGARSDKLLYQAKL--------ALD--EDLRLKVVRKMFELRFGEPAPA 130 V V A + + QA+L ++ + + LR A Sbjct: 70 VPVAATLPWSTHNRVAARHRAQAELSLPRQKNAWMNIVKTKIRNQAAVLRALRRDGVAQL 129 Query: 131 RRSVEQLRG-----IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATS 185 R Q+R EG+ R +A L + + + D +N + + Sbjct: 130 ERLAAQVRSGDASNAEGAAARVYWARL-------FQDKHFRRV-PRARDVVNGLLDYGYA 181 Query: 186 CLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEP 243 L G A++ AG AP++G H F D+ + + V EI Sbjct: 182 ILRGCCLRAVVGAGLAPSLGLWHRRHDNPFTLVDDLIEPFRPA-VDKTVIEIVTAGASGL 240 Query: 244 DREVRLACRDIFRSS 258 DR + + Sbjct: 241 DRPTKRLLVAVLDHQ 255 >UniRef50_Q0ADY5 CRISPR-associated protein, Cas1 family n=2 Tax=Nitrosomonas RepID=Q0ADY5_NITEC Length = 328 Score = 64.6 bits (156), Expect = 4e-09, Method: Composition-based stats. 
Identities = 38/228 (16%), Positives = 77/228 (33%), Gaps = 41/228 (17%) Query: 18 IFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +F+ +++ GA V + +P+ + + L ++ A + + G + Sbjct: 4 LFVDRRGVVLELESGAIVFRENGERIGTVPIAPLTRVFLRGDVKLPAALLGKLGEQGVGV 63 Query: 76 V-WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP----- 129 V G G +P + + + Q +L+ D+ L++ + + E + Sbjct: 64 VILSGRIGRPSLLLARP-HNDAARRVVQIRLSFDKPFCLQIAKALIERKLTRQIEWFAEL 122 Query: 130 --------------------------ARRSVEQLRGIEGSRVRATYALLAKQY--GVTWN 161 S LRG+EGS ++ L + ++ Sbjct: 123 RENDMQVRYELSHALRALEEHRSRIGHVSSAASLRGVEGSAAARYFSGLQAVVPDSLHFS 182 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 GR P D N +S + L+ A+ G+ P +GF H Sbjct: 183 GRNRRPPR----DPFNALLSLTYTLLHSEIAIALYGTGFDPYVGFYHR 226 >UniRef50_B8D4S7 CRISPR-associated protein Cas1 n=1 Tax=Desulfurococcus kamchatkensis 1221n RepID=B8D4S7_DESK1 Length = 328 Score = 64.6 bits (156), Expect = 4e-09, Method: Composition-based stats. Identities = 31/161 (19%), Positives = 61/161 (37%), Gaps = 27/161 (16%) Query: 118 KMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLA----KQYGVTWNGRRYDPKDWEKG 173 +M E F E E+LR +E R + L+ K+ G ++ +D + Sbjct: 146 RMLEAEFEEA------REKLRQLEAEAARIYWPSLSILIPKELG-------FNSRDQDSE 192 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPK 231 D +N ++ A LYG + ++ AG P GF+HT + +D+ ++ + Sbjct: 193 DLVNTSLNYAYGILYGESWKVLVLAGLDPYAGFMHTDRSGKPVLAFDLIEMFR------- 245 Query: 232 AFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 F + + ++ A+++ I L Sbjct: 246 -FTADSTLLAMYRHGWKPRVLNGLLDYESRARIVESIMKTL 285 >UniRef50_C8PKY6 Putative CRISPR-associated protein Cas1 n=1 Tax=Campylobacter gracilis RM3268 RepID=C8PKY6_9PROT Length = 731 Score = 64.3 bits (155), Expect = 5e-09, Method: Composition-based stats. 
Identities = 42/249 (16%), Positives = 88/249 (35%), Gaps = 38/249 (15%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + + G VL K I+ P+ ++ I++ +S A ++ A+ + ++ E Sbjct: 421 LALSQGKLVLKSKGAIKHKFPINQISQIIINAQISLSSAVIKECAKKKISINFIDEKTNL 480 Query: 85 VYASGQPGGAR-SDKLLYQAKLALDEDLRLKVVRKM-----------------FELRFGE 126 YA+ + Q L + L++ ++ + G Sbjct: 481 SYATLISANSAIPKTAASQISLLTTKKS-LRIAQQFIIGKLKNQINYLKYLGKYHKNLGA 539 Query: 127 PA-----------PARRSVEQLRGIEGSRVRATYALLAKQ--YGVTWNGRRYDPKDWEKG 173 P SV +L G EGS + + +AK Y ++ R Sbjct: 540 EIKAMQEILKLRVPGAASVSELMGFEGSAANSYWQAIAKAVDYEFGFSARVTQ----GAT 595 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPK 231 D +N ++ + LY +I AAG +P + ++H + + +D+ + + V Sbjct: 596 DIVNSALNYGYAILYSKILKSIAAAGLSPHVSYLHALDEQKPTLAFDLIEEFRAFIVDRA 655 Query: 232 AFEIARRNP 240 + +N Sbjct: 656 VISMVNKNE 664 >UniRef50_A3ZPG0 Putative uncharacterized protein n=1 Tax=Blastopirellula marina DSM 3645 RepID=A3ZPG0_9PLAN Length = 331 Score = 64.3 bits (155), Expect = 5e-09, Method: Composition-based stats. 
Identities = 48/277 (17%), Positives = 94/277 (33%), Gaps = 37/277 (13%) Query: 35 IDKTGIRTH-IPVGSVACIMLE-PGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPG 92 + G H IP + ++++ PGT +HAA+R A +V G + A+ Sbjct: 59 FKREGAVVHTIPCEEIGVVVIDHPGTTYTHAALRQLAASDAAVVICGANHLP--AAILLP 116 Query: 93 GARSD----KLLYQA--------KLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGI 140 A +L+ Q +L ++ + P R+ ++ L + Sbjct: 117 LADHSEVVWRLVEQIEAPKPLCKRLW---RQIVRAKIDAQASVLPDDCPGRQKLKSLIPL 173 Query: 141 E--GSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAA 198 G A AK Y W +D + D +N ++ + AI++A Sbjct: 174 VKPGDAA-NVEARAAKTYWQFWYPEAGFRRDADS-DGVNALLNYGYAIARAAIARAIVSA 231 Query: 199 GYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFR 256 G PA+G H + F D+ + + V + E+ R+ +E + + Sbjct: 232 GLTPALGLHHRNRSNPFCLADDLIEPFRP-MVDERVRELHRQGYDSLTQECKGELLKLLA 290 Query: 257 SSKTLA-----------KLIPLIEDVLAAGEIQPPAP 282 +L +++ + L + P Sbjct: 291 RQTSLDGEKGPFMVQLHRMLGSLVRCLRKKQTSLEIP 327 >UniRef50_D2R8Z2 CRISPR-associated protein Cas1 n=1 Tax=Pirellula staleyi DSM 6068 RepID=D2R8Z2_9PLAN Length = 942 Score = 64.3 bits (155), Expect = 6e-09, Method: Composition-based stats. 
Identities = 52/262 (19%), Positives = 96/262 (36%), Gaps = 40/262 (15%) Query: 20 LQYGQIDVIDGAFVLIDKTGIRTHIPVGSV-ACIMLEPGTRVSHAAVRLAAQVGTLLVWV 78 ++ + F D+ GI + IP+ ++ ++L +SH + Q L+ + Sbjct: 622 IEEILVHGDVLQFKHQDQRGIES-IPLHTLREVVVLG-AVSLSHRVLSAIQQNAISLLLL 679 Query: 79 GEAGVRV-YASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM-------FE--------- 121 E+ R Y Q S + Q L + + L + R++ + Sbjct: 680 DESANRTAYIDCQNAEPDSAGIEAQVDLIRNPESSLAIARQLISAKLHNYATLADAYPPK 739 Query: 122 ----------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKD 169 +R + A S+ +L GIEGS Y + + G W R Sbjct: 740 SHSGNAHRSLMRLAKDAQQASSLPELLGIEGSGAALWYGEIGMRLSPGFHWERR----VA 795 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDT 227 D +N ++ A + LY +T+ AI A +GF+H + + D+ + + Sbjct: 796 PNAHDPVNILLNLAQTVLYRMTQHAIAQAKLVDTLGFLHQPRAGHAALASDMQEPFR-HL 854 Query: 228 VVPKAFEIARR-NPGEPDREVR 248 + E RR P E + + R Sbjct: 855 MDRAVLETVRRIRPEEFEPDER 876 >UniRef50_B5IAF4 CRISPR-associated protein Cas1 n=3 Tax=Euryarchaeota RepID=B5IAF4_9EURY Length = 404 Score = 64.3 bits (155), Expect = 6e-09, Method: Composition-based stats. Identities = 35/213 (16%), Positives = 67/213 (31%), Gaps = 49/213 (23%) Query: 59 RVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRK 118 +S A+R + + + G + + Q ++ ++E RLK+ K Sbjct: 50 NISFEAIRWLMKHNITVSVLNWNGNLLSVFLPKEPINGKLKIRQYEIYINEKERLKIAEK 109 Query: 119 ------------MFE--------------------------LRFGEPAPARRSVEQLRGI 140 ++E + E + S L Sbjct: 110 ILEEKIRKSENMLYELSEYYPEIEHIKVKKRIEKEEKLKRDMELKEENKPKLSY--LLMY 167 Query: 141 EGSRVRATYALLAKQYG-----VTWNGRRYDPKDWE--KGDTINQCISAATSCLYGVTEA 193 EG + + L+K + + R W D IN ++ + + L + Sbjct: 168 EGRVAQIYWKELSKIFNKLYPEFNFTSRSTKSYSWNMNASDEINALLNYSYALLESMIRK 227 Query: 194 AILAAGYAPAIGFVHT--GKPLSFVYDIADIIK 224 I A G P+IGF+H VYD+ ++ + Sbjct: 228 HINAVGLDPSIGFLHELASSKTPLVYDLQELFR 260 >UniRef50_C9RRG3 CRISPR-associated protein Cas1 n=1 Tax=Fibrobacter succinogenes subsp. 
succinogenes S85 RepID=C9RRG3_FIBSS Length = 343 Score = 63.9 bits (154), Expect = 6e-09, Method: Composition-based stats. Identities = 51/320 (15%), Positives = 98/320 (30%), Gaps = 45/320 (14%) Query: 18 IFL-QYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +++ + G ++ G V+ I + V + L + A+ G + Sbjct: 3 LYVTEQGTRLGKNGGHLVVQRDGCTIDDILLSEVDSLSLFGAVHPTTDAMLALLDKGADI 62 Query: 76 VWVGEAG-------------VRVYASGQPGGARSDKLLYQAK--LALDEDLRLKVVRKMF 120 ++ G V + D+ AK + + L+V+ Sbjct: 63 AFLSSGGRYRGRLVSAVGKNVPLRLCQYDVFRDDDRAFALAKSCVVRKLENGLRVLEAYS 122 Query: 121 E-----LRFGEPAP-----------ARRSVEQLRGIEGSRVRATYALLAK--QYGVTWNG 162 + RF ++LRG EG+ R + + G+ + G Sbjct: 123 KNPHNSFRFENRDEYLRNLNAVRRLQGFDRDELRGFEGNGARIYFENFGRCLACGLDFPG 182 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIA 220 R+Y P D +N +S + E+ + + G P +G++H S D+ Sbjct: 183 RKYHP----STDPVNALLSFGYTLTARSLESLLESYGMDPMLGYLHEPSYGRNSLAQDML 238 Query: 221 DIIKFDTVVPKA-FEIARRNPGEPDREVRLACR---DIFRSSKTLAKLIPLIEDVLAAGE 276 + + V F RR D E R +F + + + ED +A Sbjct: 239 EEFRHPLVDRLVLFLFNRRVLVADDFEQRNDENSSGQLFLKPEKMRVFLHHYEDFVARPN 298 Query: 277 IQPPAPPEDAQPVAIPLPVS 296 + L V Sbjct: 299 GIYQGLANLPWRSVMRLRVE 318 >UniRef50_Q2J7N9 CRISPR-associated protein Cas1 n=1 Tax=Frankia sp. CcI3 RepID=Q2J7N9_FRASC Length = 344 Score = 63.5 bits (153), Expect = 8e-09, Method: Composition-based stats. 
Identities = 45/253 (17%), Positives = 77/253 (30%), Gaps = 47/253 (18%) Query: 25 IDVIDGAFVLI--DKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 + + A + D R +P+ V I++ G ++ ++ A + W+ G Sbjct: 17 LHLDGDAVRIWHPDNDKGRRLLPLVRVDHIVVFGGVTITDDLLQRCATDRRSVTWLTGNG 76 Query: 83 VRVYASGQPGG--------------ARSDKLLYQA------KLALDEDLRLKVVR----- 117 R A + ++ L A KL L L+ R Sbjct: 77 -RFRARVEGPTGGNPHLRIAQHDHFRDDERRLTLAMSYIAGKLQNSRQLLLRAARDATGT 135 Query: 118 KMFELR-----FGEPAPA---RRSVEQLRGIEGSRVRATYALLAK-----QYGVTWNGRR 164 + LR + P +V + G+EG R A GR Sbjct: 136 RQTALRDTAAHLADALPTLRDTTNVAEAMGVEGQAARRYIATWPHLLTPHATVTAPAGRT 195 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADI 222 P D +N +S L A+ G P IG++H +P + D+ + Sbjct: 196 SRPA----TDPVNAALSFGYGILRIAVHGALDHVGLDPHIGYLHGIRPGKPALALDLMEE 251 Query: 223 IKFDTVVPKAFEI 235 + V F Sbjct: 252 FRALLVDRLVFTA 264 >UniRef50_Q1J1U7 CRISPR-associated protein Cas1 n=1 Tax=Deinococcus geothermalis DSM 11300 RepID=Q1J1U7_DEIGD Length = 342 Score = 63.5 bits (153), Expect = 8e-09, Method: Composition-based stats. 
Identities = 38/227 (16%), Positives = 76/227 (33%), Gaps = 22/227 (9%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGV 83 ++ + + G M G + A A + + Sbjct: 59 RLAREHKPVTWLSEHGRFM----ARTETPM--SGNVLLRTAQHACAGNAARTLAI----A 108 Query: 84 RVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGS 143 R+ A+G+ + L + D+ L+ + ++ P +V+++RG EG+ Sbjct: 109 RLIAAGKLQNQKVTLLRAAREAEADDAALLRQAARDINVQIA-CLPLTETVDEVRGTEGT 167 Query: 144 RVRATYA----LLAKQYGVTW-NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAA 198 R + +L + W + R P D IN ++ + L +A A Sbjct: 168 AARLYWEVFPLMLRQNRDFFWLSERHRRPAR----DPINALLNFVYTVLANDCASACQAV 223 Query: 199 GYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 G P +GF+H +P S D+ + ++ + R P Sbjct: 224 GLDPQLGFLHALRPGRSSLALDLMEELRPVIADRAILTLINRQQLTP 270 >UniRef50_D0MKP4 CRISPR-associated protein Cas1 n=1 Tax=Rhodothermus marinus DSM 4252 RepID=D0MKP4_RHOM4 Length = 525 Score = 63.5 bits (153), Expect = 9e-09, Method: Composition-based stats. Identities = 45/286 (15%), Positives = 90/286 (31%), Gaps = 56/286 (19%) Query: 8 PIPLKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIR----THIPVGSVA-CIMLEPGTRV 60 +P + +++ + V+ R +P V +++ P ++ Sbjct: 178 IMPPVRQARTLYVDEIGAVVRRKGRQLVVTVSRDGRRQELLRVPALLVDQVVLVGP-VQI 236 Query: 61 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGAR-SDKLLYQAKLALDEDLRLKVVRKM 119 + A+R+ + +V++ G R L Q + D + L + R Sbjct: 237 TSQALRMLLRRNVDIVYLSGEG-RFEGRLAAEFHPHVALRLAQYEAFRDPERTLTLARLF 295 Query: 120 FE----------LRFGEP-------------------APARRSVEQLRGIEGSRVRATYA 150 R+ + ++E LRG+EG+ R ++ Sbjct: 296 VRGKLQNMAGLLRRYADEYGSASLRAAASEINRDLERLEQVTTLEALRGVEGTASRRYFS 355 Query: 151 LLAKQYGVT---------WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYA 201 + + + GR P D +N + + L G AA AG Sbjct: 356 VFGEMLRAEAYAPTGWPAFPGRHRRPP----TDPVNATLGYLYALLLGNVVAACALAGLD 411 Query: 202 PAIGFVHT---GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 P +G++H G+P S D+ + + A + R P Sbjct: 412 PYVGYLHAPAYGRP-SLALDLMEEFRAPAADRLALRLFNRGRLRPQ 456 >UniRef50_B2RM83 CRISPR-associated protein Cas1 n=3 Tax=Bacteroidetes 
RepID=B2RM83_PORG3 Length = 338 Score = 63.1 bits (152), Expect = 1e-08, Method: Composition-based stats. Identities = 46/285 (16%), Positives = 83/285 (29%), Gaps = 51/285 (17%) Query: 16 SMIFLQYGQIDVIDGAFVLI---------DKTGIRTHIPVGSVACIMLEPGTRVSHAAVR 66 + G++ D + ++ G +IPV ++ + R + + Sbjct: 4 TYYLFNPGELSRKDNTIRFVPIQEGENGQEQAGQARYIPVEGISDFYVFGSLRANSSLYN 63 Query: 67 LAAQVGTLLVWVGEAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVVRKM---- 119 + + Y LL QA ++ RL + RK Sbjct: 64 FLGSNDIAVHFF--DYYENYTGSFMPRDFLLSGKMLLAQASAYKNKKKRLFLARKFIEGA 121 Query: 120 ----------FELRFGEPAP-------------ARRSVEQLRGIEGSRVRATYALLAKQY 156 + R + P ++E L GIEG+ +R Y Sbjct: 122 ASNMQKNLAYYNNRGKDMQPMMELIDKYSLRLEETTTIEALMGIEGN-IRQAYYDAFNLI 180 Query: 157 GVTWN--GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL- 213 + R P E +N IS Y + AI + P I F+HT Sbjct: 181 IDPFEMGARSKQPPQNE----VNALISFGNMMCYTLCLKAIHQSQLNPTISFLHTPGERR 236 Query: 214 -SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRS 257 S DI+++ K V F++ + + + + Sbjct: 237 YSLCLDISEVFKPILVDRTIFKVMNKRIIQA-KHFDKQLNKCILN 280 >UniRef50_B1YCK7 CRISPR-associated protein Cas1 n=2 Tax=Thermoproteaceae RepID=B1YCK7_THENV Length = 303 Score = 63.1 bits (152), Expect = 1e-08, Method: Composition-based stats. 
Identities = 47/269 (17%), Positives = 88/269 (32%), Gaps = 44/269 (16%) Query: 31 AFVLIDKTGIRTHIPVGSVACI-MLEPGTRVSHAAVRLAAQVGTLLVWVGEAG---VRVY 86 A V+ + G IP+ V + +L G +S VR A+ +V+ G R++ Sbjct: 18 ALVVRRRGGAAERIPIHQVDRLWILTGGVSISSRLVRALARSFVDVVFFDGRGNPAARLF 77 Query: 87 ASGQPGGARSDKLLYQAKL---------------ALDEDLRLKVV----RKMFELRFGEP 127 G + Y+A L +++ L+ R++++ G Sbjct: 78 PPEANGTVAHRRAQYEAYLNGRGLELAKLVVYGKIVNQAAALRRAGLWRRELYQELAGAA 137 Query: 128 ---------APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQ 178 P + + G EG +A L+K +G +D D N Sbjct: 138 SRVAEAAAAVPRCGDPQCVLGHEGRAAAEYWAALSKAFGTP-------TRDPNASDPFNL 190 Query: 179 CISAATSCL-YGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEI 235 ++ L Y V A++ G P G++H K S V D+ + + + + Sbjct: 191 ALNYGYGILRYAVWRQAVIH-GLDPYAGYLHVDKSGRPSLVLDLMEEFRPHIDL-MVLKA 248 Query: 236 ARRNPGEPDREVRLACRDIFRSSKTLAKL 264 ++ R +L Sbjct: 249 KPSADWLEGGVLKREARAALVEKWLEMRL 277 >UniRef50_C0QR16 Crispr-associated protein Cas1 n=23 Tax=Bacteria RepID=C0QR16_PERMH Length = 331 Score = 63.1 bits (152), Expect = 1e-08, Method: Composition-based stats. 
Identities = 43/284 (15%), Positives = 86/284 (30%), Gaps = 38/284 (13%) Query: 23 GQIDVIDGAFVLIDKTGIRT---HIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVG 79 G++ + + +P+ + I + ++ A+ + + + Sbjct: 11 GRLRRKENTLYFETEQNGEIVKKALPINDIDVIYVFGELDINTKALNYLSGYDIPIHFYN 70 Query: 80 EAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVV------RKMFELR------- 123 G Y+ L+ Q + +D R + LR Sbjct: 71 YYG--FYSGSFLPRKKNVSGSLLVEQVRHYIDNKKRQYLAVSFVEGAAYHILRNLRKQNI 128 Query: 124 ------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWE 171 F + R + E+L +EG+ +R Y L + + + Sbjct: 129 NSEDFEEIEKDLFPKIFETRNT-EELMALEGN-IRERYYQLFNKIINNEDFFMEKREKRP 186 Query: 172 KGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVV 229 + IN IS S +Y I P I ++HT K S D+A+I K + Sbjct: 187 PDNPINALISFGNSIMYNTVLTEIYRTQLDPTISYLHTPQEKRFSLSLDLAEIFKPFIID 246 Query: 230 PKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 P F + ++ + + K + E+ L+ Sbjct: 247 PLIFSLINNRQITI-KDFDKDLNYAYLNENGRKKFLKAYEERLS 289 >UniRef50_O28401 Putative uncharacterized protein n=1 Tax=Archaeoglobus fulgidus RepID=O28401_ARCFU Length = 345 Score = 62.3 bits (150), Expect = 2e-08, Method: Composition-based stats. Identities = 27/120 (22%), Positives = 48/120 (40%), Gaps = 12/120 (10%) Query: 134 VEQLRGIEGSRVRATYALLA----KQYGVTWNGRRYDP--KDWEKGDTINQCISAATSCL 187 E+L GIEG + + ++ ++Y +NGRR D +N ++ S L Sbjct: 161 RERLLGIEGKASKHYWDAISLVIPEEYR--FNGRRGIEIGSPRYAKDIVNAMLNYGYSIL 218 Query: 188 YGVTEAAILAAGYAPAIGFVH---TGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 A+ AG P GF+H +G+ S D+ + + V + +P+ Sbjct: 219 LAECVKAVELAGLDPYAGFLHVDVSGRS-SLAIDLMENFRQQVVDRVVLRLISYRQIKPE 277 >UniRef50_B4ATI8 Crispr-associated protein Cas1 n=5 Tax=Proteobacteria RepID=B4ATI8_FRANO Length = 318 Score = 62.3 bits (150), Expect = 2e-08, Method: Composition-based stats. 
Identities = 22/116 (18%), Positives = 47/116 (40%), Gaps = 4/116 (3%) Query: 122 LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCIS 181 R + +++ + G EG ++ LAK ++ + + + + N ++ Sbjct: 138 KRVAQLIKNAKTLNDVLGYEGYAANIYFSSLAKDKFLSASF--ANREGRGSQEIANSMLN 195 Query: 182 AATSCLYGVTEAAILAAGYAPAIGFVHTGKP--LSFVYDIADIIKFDTVVPKAFEI 235 + L AI AG P +GF+H +P +S V D+ + + V ++ Sbjct: 196 FGYAILSSYILNAITNAGLEPYLGFLHQKRPGKMSLVLDLMEEYRAWVVDRVVIKL 251 >UniRef50_O66692 Putative uncharacterized protein n=2 Tax=Aquificaceae RepID=O66692_AQUAE Length = 316 Score = 61.9 bits (149), Expect = 2e-08, Method: Composition-based stats. Identities = 38/248 (15%), Positives = 74/248 (29%), Gaps = 42/248 (16%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 +G + + + ++ IPV V I + LL ++ Sbjct: 10 HGTLSRHENTLRFENAE-VKKDIPVEDVEEIF----------VFAELSLNTKLLNFLASK 58 Query: 82 GVRVY-----------ASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM---------FE 121 G+ ++ + L+ Q + LD RL + + + Sbjct: 59 GIPLHFFNYYGYYTGTFYPRESSVSGHLLIKQVEHYLDAQKRLYLAKSFVIGSILNLEYV 118 Query: 122 LRFGEPA-----PARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 + S+ +L +E + Y L + G R P + + Sbjct: 119 YKISADTYLNKVKETNSIPELMSVEAEFRKLCYKKLEEVTGWELEKRTKRPPQ----NPL 174 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIADIIKFDTVVPKAFE 234 N IS S Y I P + ++H + K S D+A++ K V Sbjct: 175 NALISFGNSLTYAKVLGEIYKTQLNPTVSYLHEPSTKRFSLSLDVAEVFKPIFVDNLIIR 234 Query: 235 IARRNPGE 242 + + N + Sbjct: 235 LIQENKID 242 >UniRef50_C4FMU2 Putative uncharacterized protein n=1 Tax=Veillonella dispar ATCC 17748 RepID=C4FMU2_9FIRM Length = 331 Score = 61.9 bits (149), Expect = 2e-08, Method: Composition-based stats. 
Identities = 43/266 (16%), Positives = 81/266 (30%), Gaps = 42/266 (15%) Query: 15 VSMIFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S +++ I G V+ + +P+ + I + ++ + V + G Sbjct: 1 MSSLYVTEAGSFIKRDGGHVVVGRNNEVLFEVPLERIEDITVFDSVSITSSLVTDFIERG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM------------- 119 + W+ G +K Q L D R+ + RK+ Sbjct: 61 VPITWLSGYGKYFGTLINTNTIDINKHKKQFDLLDDNAFRVAMSRKIIRAKVRNQLTILR 120 Query: 120 -FELRFGEPAP----------------ARRSVEQLRGIEGSRVRATYALLAK--QYGVTW 160 + E V +L G EG R + L K + Sbjct: 121 RYARNLEEDINIDAQIANIKSVRSHIGECMRVSELMGYEGLISRLYFEALGKIVPSAFAF 180 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVY 217 R P D N + S L+ A ++ AG P +G +H+ G P + Sbjct: 181 TKRTKQPPR----DPFNAMLGLGYSMLFNEILAGVINAGLHPFVGIMHSLAKGHP-ALAS 235 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEP 243 D+ + + + + RN + Sbjct: 236 DLIEEWRAPIIDSMVLSMVSRNMVDL 261 >UniRef50_Q1Q3I6 Putative uncharacterized protein n=1 Tax=Candidatus Kuenenia stuttgartiensis RepID=Q1Q3I6_9BACT Length = 328 Score = 61.9 bits (149), Expect = 3e-08, Method: Composition-based stats. 
Identities = 33/251 (13%), Positives = 75/251 (29%), Gaps = 39/251 (15%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGE 80 Q + ++ + + + +++ + + AV + G + + Sbjct: 4 QNSILRKSGDRLIIEKDDKVLLEVQCHKIDAVLIFGNVQFTTQAVHELFEHGIEMAILTR 63 Query: 81 AGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR--------------KMFELRFG- 125 G + P + Q K ++D RL + + F Sbjct: 64 TGKLIGQITSPYTKNITLRVQQFKQYWNDDFRLAFAKVIVCGKIQNCIQLVRSFSYNHPR 123 Query: 126 --------------EPAPARRSVEQLRGIEGSRVRATYALLAKQ--YGVTWNGRRYDPKD 169 + ++ QL GIEG+ R + K + GR+ P Sbjct: 124 NSFDVEMDDLSLRLNEVESAANISQLFGIEGNAARVYFTSFGKMILSAFAFPGRKKYP-- 181 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIKFD 226 D +N +S + ++ + + G+ P +G+ H G+ S D+ + + Sbjct: 182 --STDPVNALLSLNYTMIFNEISSLLDGLGFDPYLGYYHGIDYGRS-SLASDLMEEFRAP 238 Query: 227 TVVPKAFEIAR 237 + Sbjct: 239 IADRITLNLIN 249 >UniRef50_A2SRR7 CRISPR-associated protein, Cas1 family n=24 Tax=cellular organisms RepID=A2SRR7_METLZ Length = 344 Score = 61.9 bits (149), Expect = 3e-08, Method: Composition-based stats. 
Identities = 42/327 (12%), Positives = 97/327 (29%), Gaps = 72/327 (22%) Query: 11 LKDRVSMIFLQ--YGQIDVIDGAFVLIDKTGIRTHIPVGSV-ACIMLE-----PGTRVSH 62 ++ +++++ + + V+ + +P+ ++ + PG Sbjct: 1 MRKLRNVLYVTNPKSYLSRDGESIVVSVENQELARVPIRNLEGVVCFGYMGASPGM---- 56 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR----- 117 + L + + +V G + G Q ++ DE+ K+ Sbjct: 57 --MALCTENDVGMCFVSPYGKFMARIGGNVSGNVLLRKRQYAVSDDEEASKKIAAYCILG 114 Query: 118 -----KMFELRFGEPAPARRSVE------------------------QLRGIEGSRVRAT 148 + RF P S E +LRG EG + Sbjct: 115 KLMNCRTVLQRFSRDYPDMVSREFEQNFKRLSEGILQIRAGTCGSLNELRGFEGILSKYY 174 Query: 149 YAL-----LAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPA 203 + L+ + ++ R P D +N +S + + + +A+ + G P Sbjct: 175 FHSFNDLILSTEPEFSFENRSRRPP----LDRVNALLSFSYTLIAADCASALESVGLDPQ 230 Query: 204 IGFVHTGKPL--SFVYDIADIIKFDTVVPKA-------------FEIARRNPGEPDREVR 248 +GF+H +P S D+ + + F + + R Sbjct: 231 VGFLHRVRPGRPSLALDLMEEFRPYLGDRFVLSLINNRVVKADDFAVKENGAVLLTDDAR 290 Query: 249 LACRDIFRSSKTLAKLIPLIEDVLAAG 275 ++ K + +E+ + G Sbjct: 291 KTVLQAWQKRKKEEVMHGYLEEKMPVG 317 >UniRef50_Q2FQQ2 CRISPR-associated protein Cas1 n=2 Tax=Methanomicrobia RepID=Q2FQQ2_METHJ Length = 347 Score = 61.6 bits (148), Expect = 3e-08, Method: Composition-based stats. 
Identities = 49/291 (16%), Positives = 93/291 (31%), Gaps = 30/291 (10%) Query: 10 PLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTH-IPVGSVACIMLEPGTRVSHAAVRLA 68 L+D I YG+I + G + D G P+ V + + VS ++ Sbjct: 25 LLEDDTIYITTPYGKISLDGGRIQVKDSDGEIVASFPLEKVCTMNVFGSASVSTPLLKHC 84 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQ------------------AKL----A 106 + ++ + G + + S P + AKL Sbjct: 85 SDKEVVINYFTNFG-KYFGSFVPSRNTIALVRRHQAGITKEKSLAICREIIHAKLQNSCV 143 Query: 107 LDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYD 166 R++V + ELR + + SV+ LRGIEG + +L+ W R Sbjct: 144 FLARKRVEVPSLLKELR--DRSLHAVSVDSLRGIEGEAASIYFPMLSSSLPDEW--RSDK 199 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIK 224 D +N +S + + +A+ P IG +H + + D+ + + Sbjct: 200 RTRRPPRDELNALLSLTYTMVNTEVISALRQYNLDPFIGVMHVDRHGKPALALDLLEEFR 259 Query: 225 FDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 + + D + + + K L +E+ L Sbjct: 260 PVFCDAFVARLINKRMITKDDFTQGSRLNDTAFKKYLGFYHDFMEESLKHP 310 >UniRef50_B8G918 CRISPR-associated protein Cas1 n=5 Tax=Chloroflexi (class) RepID=B8G918_CHLAD Length = 339 Score = 61.6 bits (148), Expect = 3e-08, Method: Composition-based stats. 
Identities = 61/317 (19%), Positives = 107/317 (33%), Gaps = 52/317 (16%) Query: 18 IFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +++ Q +I + I +P+ + I++ +S A++ G + Sbjct: 4 LYVIEQGAEIGCDGERIEVRRGADIIGSVPLVKLDDIVIFGNVGISTPAMKRLLDRGIEV 63 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLYQAKLA--LDEDLRLKVVRKMFE----------LR 123 ++ G Y G + L A+ A D L + ++ E R Sbjct: 64 TFMTVDG--RYQGRLIGQVTAHVALRHAQYACAADPARALALAQRFVEGKLRNQRALLQR 121 Query: 124 FG----EPAP-----------------ARRSVEQLRGIEGSRVRATYALLAKQYGVTWN- 161 F EP P + L G+EGS +A L G W+ Sbjct: 122 FSRNRAEPPPEAQAAADDLEAYIKRVKRTTQLSSLLGVEGSATARYFAGLRSLIGPEWSF 181 Query: 162 -GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVY 217 GR+ P D +N +S + L A+ AAG+ P +GF+H+ G+P S Sbjct: 182 SGRQRRPP----PDPVNLLLSLGYTLLAHKVLGAVQAAGFDPYLGFLHSLDYGRP-SLAL 236 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVR--LACRDIFRSSKTLAKLIPLIEDVLAAG 275 DI + + + I P+ R R I + + + E+ + Sbjct: 237 DIMEEFRPILIDSLVVRICNDGRIRPE-HFRPGEGERPIIITDEGKRAFLTAFEERMRTE 295 Query: 276 EIQPPAPPEDAQPVAIP 292 P D+ P +P Sbjct: 296 ATHPEGA--DSGPGKVP 310 >UniRef50_C5SD37 CRISPR-associated protein Cas1 n=1 Tax=Allochromatium vinosum DSM 180 RepID=C5SD37_CHRVI Length = 346 Score = 61.6 bits (148), Expect = 3e-08, Method: Composition-based stats. 
Identities = 58/273 (21%), Positives = 94/273 (34%), Gaps = 54/273 (19%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT---------- 73 Q+ + GA L+ HIP+G + +++ V R A+ G Sbjct: 15 QLTLDGGALRLVTPGTKPRHIPLGVLGLVVVHGRALVGCDVWRALAERGIPAVMQPGRGR 74 Query: 74 -LLVW----VGEAGVRVYASGQPGGARSDKL-----------LYQAKL------ALDEDL 111 + W +G GV A + +L L QA++ A D Sbjct: 75 GVCAWMGPALGATGVLRAAQHRAAEQVERRLMLARDLIAAKLLAQARVIERLPVASDPPE 134 Query: 112 RLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTW--NGRRYDPKD 169 VR+ +L A + E + G+EGS A + L TW GR P Sbjct: 135 TRTAVRRQQDLALARLGQASSTTEVM-GLEGSAAAAWFRWLTLWLSPTWGFQGRNRRPPR 193 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIK-- 224 D +N +S + L G + I G PA G +H G+ S V D+ + ++ Sbjct: 194 ----DPVNALLSLGYTLLGGEMLSVIQQQGLDPARGLLHELVPGRE-SLVLDLIEPLRPS 248 Query: 225 ---------FDTVVPKAFEIARRNPGEPDREVR 248 + P+ F + + +E R Sbjct: 249 VDLVLLGMLDRLLTPEDFTTSPEDGCRLSKEAR 281 >UniRef50_Q3B3C1 CRISPR-associated protein, Cas1 family n=20 Tax=Bacteria RepID=Q3B3C1_PELLD Length = 343 Score = 61.2 bits (147), Expect = 5e-08, Method: Composition-based stats. 
Identities = 36/219 (16%), Positives = 71/219 (32%), Gaps = 46/219 (21%) Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLY----------QAKLALDE----- 109 + A G + ++ E G + Q + L Q+ + Sbjct: 57 MGHCAANGVTVTFLTEYG-KFLCQVQGPTKGNILLRRAQYRQADNYLQSAMLARSFVIGK 115 Query: 110 --DLRLKVVRKM---------FELRFGEPAPAR--------RSVEQLRGIEGSRVRATYA 150 + R+ + R + ++ + + A E++RGIEG R + Sbjct: 116 IGNSRVTLARALRDHPDKIDCEKMHYAQQLLAGCIKKLGNETDQERIRGIEGEAARIYFE 175 Query: 151 LL-----AKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIG 205 + + +NGR P D +N +S + + +A+ + G PA G Sbjct: 176 VFDQCITSSDPLFCFNGRNRRPP----VDRVNCLLSFLYTLVTHDIRSALESCGLDPAAG 231 Query: 206 FVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGE 242 F+H +P S D+ + + A + R Sbjct: 232 FLHKDRPGRPSLALDMLEEFRSYIADRMALSLINRGQIH 270 >UniRef50_C0W0W5 CRISPR-associated Cas1 family protein n=1 Tax=Actinomyces coleocanis DSM 15436 RepID=C0W0W5_9ACTO Length = 287 Score = 61.2 bits (147), Expect = 5e-08, Method: Composition-based stats. Identities = 52/284 (18%), Positives = 97/284 (34%), Gaps = 30/284 (10%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 GQI GA + +P+ VA +++ S A+ G ++ G Sbjct: 3 GQISSARGAISIEPDGKEPVLVPISDVAVLLIGHRVVFSGGALHRCLSAGVAVMLCDWRG 62 Query: 83 VRVYAS--GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGI 140 V + + + + QA+L E R +++ + + A + LRG Sbjct: 63 VPEGGAFGWSDHTRVAARRIAQAQL--SEPRRKNAWKQIIKEKLRGQA-SALDDLGLRGG 119 Query: 141 EGSR-VRATYAL------LAKQYGVTWN-----GRRYDPKDWEKGDTINQCISAATSCLY 188 + R +R A+ W G P +N + A + + Sbjct: 120 DFLRELRKQVRSGDPANVEAQAAKFYWKALGGEGFNRVPGARFG---VNGMLDYAYAIVR 176 Query: 189 GVTEAAILAAGYAPAIGFVHTGKPLSF--VYDIADIIKFDTVVPKAFEIARRNPGEPDR- 245 G A+L+AG P++G H G+ +F V D+ ++ + V + F + E D Sbjct: 177 GHGIRAVLSAGLEPSLGVFHHGRSNAFCLVDDLLEVFRPA-VDAQVFGLLGDGEVEFDEV 235 Query: 246 ---EVRLACRDIFRSSKTLAKLIPLIEDVLA---AGEIQPPAPP 283 V +AC T+ + +++ PP Sbjct: 236 KHDLVDIACGKFSVDGLTIPAVFEDFAQQFGLYIEDDVEKLVPP 279 >UniRef50_B3E1C9 CRISPR-associated protein Cas1 n=1 Tax=Methylacidiphilum infernorum V4 
RepID=B3E1C9_METI4 Length = 217 Score = 60.8 bits (146), Expect = 6e-08, Method: Composition-based stats. Identities = 24/111 (21%), Positives = 43/111 (38%), Gaps = 6/111 (5%) Query: 122 LRFGEP---APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQ 178 +RF R + +RG EG R + A+Q G + D +N Sbjct: 14 VRFKRACVGLVHARQLPAVRGWEGWASRHYWRWFAQQVN-QLGGFEERRTHGQTQDPVNL 72 Query: 179 CISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDT 227 ++ + L A+ AG P +G +H G+ + V D+ +I + + Sbjct: 73 ALNYGYALLRHRLGVAVRLAGLDPYLGVLHEANGRHEALVSDLVEIFRPEV 123 >UniRef50_D2QAN8 CRISPR-associated protein Cas1 n=2 Tax=Bifidobacterium dentium RepID=D2QAN8_9BIFI Length = 305 Score = 60.8 bits (146), Expect = 6e-08, Method: Composition-based stats. Identities = 47/260 (18%), Positives = 85/260 (32%), Gaps = 38/260 (14%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGE 80 + G++ V +P+ A +++ S A + A+ G ++ Sbjct: 20 ERGRMKVRK------HGESESVCVPLAQAAVVLIGLRVCCSSAVLHEMAKAGVSVMLCDW 73 Query: 81 AGVRVYASGQPGGARSDKLLYQ-----AKLALDEDLRLKVVRKMFELRFGEPAPARRSVE 135 G+ A S+ + Q L ++ K+V+ ++R +E Sbjct: 74 RGIPDAALHSWTNVPSEVAVRQIAQSEMTLPRKKNAWAKIVKA--KIRGQASCLDSLGIE 131 Query: 136 ---QLRGI------------EGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180 LRGI EG R + + +G+R+ G N + Sbjct: 132 GGGALRGIASSVRSGDTSNYEGYAAREYWKRI-----FIGDGKRFKRI-PGDGTGRNAQL 185 Query: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARR 238 A + L G AIL+AG P +G H G+ F D+ ++ + A Sbjct: 186 DYAYTILRGFAVKAILSAGLIPTLGVNHHGRSNYFCLADDLLEVYRPAIDYWVA--QLEP 243 Query: 239 NPGEPDREVRLACRDIFRSS 258 G D+ V+ D Sbjct: 244 EDGPSDKNVKRYLADSVNQQ 263 >UniRef50_C6HZN2 CRISPR-associated protein Cas1 n=1 Tax=Leptospirillum ferrodiazotrophum RepID=C6HZN2_9BACT Length = 245 Score = 60.4 bits (145), Expect = 8e-08, Method: Composition-based stats. 
Identities = 47/242 (19%), Positives = 86/242 (35%), Gaps = 46/242 (19%) Query: 18 IFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT-L 74 ++L +DV +G+ +L ++ IP+ + +++ +S A + ++ GT Sbjct: 4 LYLDRKGLDLDVENGSLILREEGERIRSIPLTFLDRVVIRANISLSSAVLGELSESGTET 63 Query: 75 LVWVGEAG-----------------VRVYASGQPGGARSDKL--------LYQAK----- 104 +V G G +R YA R QA+ Sbjct: 64 VVLSGRQGRKVARIEGSRHNDARIRLRQYALFHDTTLRKRWAGRLVRSKIRSQARSLGTI 123 Query: 105 LALDEDLRLKVVRKMFELRFGEPAPARRSVE-----QLRGIEGSRVRATYALLAKQY--G 157 L + DL ++R M +L+ R+++ +L G+EG + +K + Sbjct: 124 LKVRPDLTSSLLRPMEQLQSIGEKIRERTLDPFEISELLGLEGGAGSLYFESFSKAFPPS 183 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSF 215 + + R P D N +S A + L+ G P IGF H S Sbjct: 184 LGFTSRNRRPPR----DPANAVLSLAYTLLHFDGVRTANMVGLDPLIGFYHEPAYGRDSL 239 Query: 216 VY 217 V+ Sbjct: 240 VF 241 >UniRef50_B8DWG2 CRISPR-associated protein Cas1 n=4 Tax=Bifidobacterium animalis subsp. lactis RepID=B8DWG2_BIFA0 Length = 535 Score = 60.0 bits (144), Expect = 9e-08, Method: Composition-based stats. 
Identities = 32/249 (12%), Positives = 68/249 (27%), Gaps = 37/249 (14%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 + + G + T +P+ S+ + + VS +R ++W G Sbjct: 219 ARAYLKSGRMHVSKNGDEITSVPLDSIQALQIHGNVDVSSGLMRELMWRNIPILWCSGTG 278 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP------------- 129 R+ + + A+ + RL + R+ + Sbjct: 279 -RLMGWSVSSYGPNGET-RVAQHVASHEGRLDLAREFISAKIHNQIVLLRRSDKNNNVLF 336 Query: 130 ----------ARRSVEQLRGIEGSRVRATYALLAKQYGV------TWNGRRYDPKDWEKG 173 ++ + +EG ++ V W R P Sbjct: 337 DMKHIEKSVVNANRIQDILSLEGQAAALYFSQFHHLISVNKRNEWPWLERMRHPA----P 392 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPK 231 D +N + S L AI++ G GF+H+ K + D+ + + Sbjct: 393 DPLNALLDYTYSLLLSDCIRAIVSCGLDAHAGFLHSSKRNKPALALDLMEEFRAPIADSV 452 Query: 232 AFEIARRNP 240 + Sbjct: 453 VQTVVNNGE 461 >UniRef50_B5IHG3 CRISPR-associated protein Cas1 n=3 Tax=Aciduliprofundum boonei T469 RepID=B5IHG3_9EURY Length = 321 Score = 60.0 bits (144), Expect = 9e-08, Method: Composition-based stats. 
Identities = 51/284 (17%), Positives = 93/284 (32%), Gaps = 34/284 (11%) Query: 18 IFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVW 77 +++ I + + + + G + IP+ ++ I S A++ G ++ + Sbjct: 4 LYITKEAIIKREANTIYLVRKGEKRSIPIHNLRDITCIAPVSFSSGAIKHVLNSGVVVHF 63 Query: 78 VGEAGVRVYASGQPGGARS---DKLLYQAKLALDEDLRLKVVRKMFE------------- 121 G Y RS + ++ QAK + + R + ++M E Sbjct: 64 FDMYG--NYEGTLYPRERSISGEVVVNQAKHYIFWEKRKYIAKEMIEGIKHNILRNLKKS 121 Query: 122 --------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG-VTWNGRRYDPKDWEK 172 + S+E+L E Y L R Y P E Sbjct: 122 NKELEEIIEKIERVEVEGDSIEELMNREAQIWGYYYKSLDYTLKKFQLERRDYRPPINE- 180 Query: 173 GDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVP 230 +N IS + LY I P+I ++H S DIAD+ K V Sbjct: 181 ---LNALISFGNTLLYSAVLTEIYHTHLNPSISYLHEPSERRFSLSLDIADVFKPTMVYR 237 Query: 231 KAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 +I + + R +F + + K I ED +++ Sbjct: 238 HIHDIVNHGII-TEDDFRKEFNGVFLNEQGKRKYIRKWEDRMSS 280 >UniRef50_A5UJ50 Uncharacterized protein predicted to be involved in DNA repair n=5 Tax=Methanobrevibacter RepID=A5UJ50_METS3 Length = 334 Score = 60.0 bits (144), Expect = 9e-08, Method: Composition-based stats. Identities = 19/115 (16%), Positives = 42/115 (36%), Gaps = 8/115 (6%) Query: 132 RSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 S ++ G EG + ++ + + R P++ D +N ++ + L Sbjct: 156 TSKNKIMGFEGIASVNYWEAVSLLLPDEINFKKRNQKPEN----DVVNSMLNYGYAILAS 211 Query: 190 VTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGE 242 I+ G P G +H + S ++D+ + + V F++ N Sbjct: 212 EIAKNIVTLGLDPYCGLLHADLKRRQSLIFDLIEEFRQQIVDKTVFKLINTNQIN 266 >UniRef50_Q8YWX6 Alr1468 protein n=4 Tax=Cyanobacteria RepID=Q8YWX6_ANASP Length = 668 Score = 60.0 bits (144), Expect = 1e-07, Method: Composition-based stats. 
Identities = 42/238 (17%), Positives = 80/238 (33%), Gaps = 39/238 (16%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + V + F + + +R +PV V+ ++L VSH AV +A + ++++ + G R Sbjct: 347 LSVKNQQFQVFHQGELRIKVPVMRVSNVVLFGCCNVSHGAVSMALRRRIPIMYLSQKG-R 405 Query: 85 VY--------------------ASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL-- 122 + + + + +AKL L LK+ R+ Sbjct: 406 YFGRLQTEGDAKVEYLMLQVERCQNHEFTRKQAEAIVRAKLHNSRALLLKLNRRHPSKIA 465 Query: 123 --------RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEK 172 E S++ LRG EG + L + + R P Sbjct: 466 ATAISGIAELMEKLSLAESMDSLRGYEGKAATLYFQGLGSLFTGAFVFEKRTKRPP---- 521 Query: 173 GDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIADIIKFDTV 228 D +N +S + L + + G G +H + V D+ + + V Sbjct: 522 TDPVNSLLSLGYTLLSQNVFSFVQVIGLHTHFGNLHVPRDNHPALVSDLMEEFRAQLV 579 >UniRef50_A6DE65 Putative uncharacterized protein n=1 Tax=Caminibacter mediatlanticus TB-2 RepID=A6DE65_9PROT Length = 267 Score = 60.0 bits (144), Expect = 1e-07, Method: Composition-based stats. Identities = 29/142 (20%), Positives = 53/142 (37%), Gaps = 11/142 (7%) Query: 133 SVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTE 192 +++L G EGS + ++K N + D IN ++ + LY + Sbjct: 96 KIDELMGYEGSIANIYWNGISKIL----NEEDFKRITKGATDRINTALNYGYAILYNKVQ 151 Query: 193 AAILAAGYAPAIGFVHT-GKPLSFVYDIADIIKFDTVVP-KAFEIARRNPGEP---DREV 247 A++ AG I F+HT K V+D + + V F + + + + Sbjct: 152 KALIKAGLGINISFLHTFEKKPVLVFDFIEQFRCVAVDKAICFSLKKSDDIDVNNKGMLT 211 Query: 248 RLACRDIFRSSKTLAKLIPLIE 269 R A R I + +L + Sbjct: 212 REAKRRIV--EEVNERLATFYK 231 >UniRef50_A0LHZ4 CRISPR-associated protein, Cas1 family n=2 Tax=Deltaproteobacteria RepID=A0LHZ4_SYNFM Length = 350 Score = 59.6 bits (143), Expect = 1e-07, Method: Composition-based stats. 
Identities = 31/142 (21%), Positives = 51/142 (35%), Gaps = 13/142 (9%) Query: 130 ARRSVEQLRGIEGSRVRATYAL---LAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 ++ ++ LRGIEG + + L + G +NGR P D +N +S + Sbjct: 147 RQKDMDLLRGIEGHAANLYFEVFPLLVRVPGFEFNGRNRRPP----LDPLNALLSFVYTL 202 Query: 187 LYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 L AI G P +G +H G+P S D+ + + + R Sbjct: 203 LTQEVLTAIKVVGLDPYLGCLHAVDYGRP-SLACDLVEEWRTFLGDRLVLALVNRRVIGL 261 Query: 244 DREVRL--ACRDIFRSSKTLAK 263 D V C D + + Sbjct: 262 DDFVYRPTPCADAVDEEELKHR 283 >UniRef50_C0WRP8 CRISPR-associated protein n=18 Tax=Lactobacillaceae RepID=C0WRP8_LACBU Length = 327 Score = 59.6 bits (143), Expect = 1e-07, Method: Composition-based stats. Identities = 47/223 (21%), Positives = 88/223 (39%), Gaps = 19/223 (8%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTR--VSHAAVRLAAQVGT 73 S+I Q+ ++ ++ GI IP+ ++ +++ TR ++ A + AQVGT Sbjct: 31 SVIITQHAKLSYSSHMMIVQTNDGIN-QIPIDDISILLIST-TRAVITTALISELAQVGT 88 Query: 74 LLVWVGEAGVR-VYASGQPGGARSDKLLYQAKLALDEDLRLK----VVRKMFE----LRF 124 +++ A G RS KLL + ++ + V KM L F Sbjct: 89 KVIFTDGANQPICETVGYYPNNRSVKLLQEQVNWDEQRKEVLWTKIVASKMINQVNVLTF 148 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + + E+L +E + A++A++Y + ++ G IN + Sbjct: 149 YKKDTTEVN-EELAKLEVADPSNREAVVARKYFPLLFNNDFSRRN---GSAINAALDYGY 204 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKF 225 S L I++ GY IG H G+ F D+ + + Sbjct: 205 SILLSSINQEIVSNGYLTYIGIHHRGEDNPFNLGSDLMEPFRP 247 >UniRef50_Q8TVS6 Uncharacterized conserved protein n=1 Tax=Methanopyrus kandleri RepID=Q8TVS6_METKA Length = 331 Score = 59.6 bits (143), Expect = 1e-07, Method: Composition-based stats. 
Identities = 52/242 (21%), Positives = 87/242 (35%), Gaps = 40/242 (16%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 ++IFL+ G+I+ + AF + G + V + I+ G +++ AVRLA + + Sbjct: 7 TVIFLRRGKIERREDAFRI----GKSKYSAVRTTGIIIAG-GAQITTQAVRLALRNEVPI 61 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE---------LRF-- 124 V++G + + L Q ++A RL R + L F Sbjct: 62 VYLGGNRILGVTVPFSE-RYATLRLKQYEIASQPSARLAFARPLIASSILARAAVLEFLA 120 Query: 125 ------------------GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYD 166 E A S + LRG EG + LA+ W Sbjct: 121 NETGITGLEDAADEVRSEAERALNAGSTDALRGYEGRAACRYFRALAEVLP-DW-AFSGR 178 Query: 167 PKDWEKGDTINQCISAATS-CLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIADII 223 D N IS + L V + +AAG P +GF+H G+ + D+ + Sbjct: 179 RTRRPPRDPFNAAISFGYAGVLLPVLLSRTVAAGLEPFLGFLHGPRGRRPGLILDLMEEW 238 Query: 224 KF 225 + Sbjct: 239 RA 240 >UniRef50_B0JKW9 CRISPR-associated protein Cas1 n=6 Tax=Cyanobacteria RepID=B0JKW9_MICAN Length = 334 Score = 59.2 bits (142), Expect = 2e-07, Method: Composition-based stats. Identities = 34/181 (18%), Positives = 67/181 (37%), Gaps = 11/181 (6%) Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEG 142 VR + G+ R+ L Q ++L L+ E G +++ LRG+EG Sbjct: 104 VRGFVRGKLKNYRNILLRRQRD---RKELDLQTAIACLEAAIG-SIETTSAIDSLRGLEG 159 Query: 143 SRVRATYALL-AKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYA 201 + A ++ + G T+ + + D +N +S + L ++AI G+ Sbjct: 160 AGSAAYFSCFDSLILGDTFTFASRNRRPPR--DPVNSLLSLGYALLRHDVQSAINLVGFD 217 Query: 202 PAIGFVH---TGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSS 258 P +G++H G+P S D+ + + V P+ + Sbjct: 218 PYLGYLHYQRYGRP-SLALDLMEEFRAIVVDAVVLNGVNHPYLTPEHFTTEPLSGAVSLT 276 Query: 259 K 259 + Sbjct: 277 R 277 >UniRef50_C6I8L1 CRISPR-associated protein n=1 Tax=Bacteroides sp. 3_2_5 RepID=C6I8L1_9BACE Length = 756 Score = 58.9 bits (141), Expect = 2e-07, Method: Composition-based stats. 
Identities = 29/143 (20%), Positives = 54/143 (37%), Gaps = 16/143 (11%) Query: 136 QLRGIEGSRVRATY---ALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTE 192 +L IE A + +L G+ + R + D +N ++ + LY Sbjct: 584 ELMAIESQAAIAYWSYIRVLTADDGIDFIRREHQ----GATDLLNSLLNYGYAILYARVW 639 Query: 193 AAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 ILAA P+IG +H + + V+D+ ++ + V + ++V L Sbjct: 640 KNILAAKLNPSIGVLHAKQDGKPTLVFDVVELFRAQMVDRVVISL-------IQKKVSLK 692 Query: 251 CRDIFRSSKTLAKLIPLIEDVLA 273 D + + LI I + L Sbjct: 693 MHDGLLNESSKRVLIRYILERLN 715 >UniRef50_B9LX94 CRISPR-associated protein Cas1 n=2 Tax=Halobacteriaceae RepID=B9LX94_HALLT Length = 331 Score = 58.9 bits (141), Expect = 2e-07, Method: Composition-based stats. Identities = 41/260 (15%), Positives = 78/260 (30%), Gaps = 32/260 (12%) Query: 16 SMIFLQY--GQIDVIDGAFVLIDKTGIRTHI---PVGSVACIMLEPGTRVSHAAVRLAAQ 70 S++++ Q+ G + D G + P + I + G S V A + Sbjct: 11 SVVYVTKQGSQVGTEGGRITVWDVDGDEGELASFPTEKLDTINVFGGVNFSTPFVAEANR 70 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA-- 128 G +L + + G Y + + +A+ LDE + + M + Sbjct: 71 HGIILNYFTQNGK--YRGSFVPEKNTIAEVRRAQYDLDETAEIDIAADMIAAKIRNARTL 128 Query: 129 ----------------PARRSVEQ---LRGIEGSRVRATYALLAKQYGVTWNGRRYDPKD 169 +V LRG+EG + L + W + + Sbjct: 129 LSRKGVHGTELLKDLGVRATTVATKDGLRGVEGEAAERYFNRLDETLTDGWTFEKRTKRP 188 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDT 227 E D IN +S + +A+ P +G +H + S D+ + + Sbjct: 189 PE--DHINSLLSLTYVFMKNEVLSALRQYNLDPFLGVLHADRHGRPSLALDLQEEFRPIF 246 Query: 228 VVPKAFEIARRNPGEPDREV 247 + R D Sbjct: 247 CDAFVTRLVNRGVITHDEFT 266 >UniRef50_Q1WVJ8 CRISPR-associated protein n=1 Tax=Lactobacillus salivarius UCC118 RepID=Q1WVJ8_LACS1 Length = 301 Score = 58.9 bits (141), Expect = 2e-07, Method: Composition-based stats. 
Identities = 40/217 (18%), Positives = 85/217 (39%), Gaps = 17/217 (7%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTR-VSHAAVRLAAQVGTLLVWVG 79 Q+ ++ + A ++ +K GI IP+ + +++ ++ A V A+ G +++ Sbjct: 10 QHSKLSYSNNAMIVQNKDGIN-QIPLVDMDILLISTTQAVITSALVSKLAESGIKVIFTD 68 Query: 80 EAGVRVY-ASGQPGGARSDKLLYQAKLALDEDL--------RLKVVRKMFELRFGEPAPA 130 V RS Q D R K++ ++ L+ + Sbjct: 69 NKNEPVTETVNYYPNNRSLDTYLQQYEWNDHVKEILWTKIVRSKIINQIKVLKNYQIDCQ 128 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 +E L +E + + A++A++Y G +Y KD+ +N ++ S L Sbjct: 129 DLKIE-LDKLEINDMTNREAVVARKYFEKLFGNKYSRKDFT---PMNAALNYGYSILLSA 184 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKF 225 I++AGY +G H + F + D+ + + Sbjct: 185 VNKEIVSAGYVTYLGIHHQSQENMFNFGSDLMESFRP 221 >UniRef50_Q1CW50 CRISPR-associated fusion protein Cas4/Cas1 n=5 Tax=Bacteria RepID=Q1CW50_MYXXD Length = 568 Score = 58.1 bits (139), Expect = 3e-07, Method: Composition-based stats. Identities = 47/264 (17%), Positives = 80/264 (30%), Gaps = 44/264 (16%) Query: 24 QIDVIDGAFVLI--DKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 ++ V+ + G + P V+ ++ +VS A+ + + W Sbjct: 241 RVGRAAEELVVTPPEGEGAPSRQPGRMVSALIAHGAVQVSAQALAYCVENDIGVHWFTSG 300 Query: 82 GVRVYASGQPGGARSDKLL-----YQAKLALDEDLRLKVVRKMFELRF------GEPA-- 128 G + G G +L QA + L RL + +LRF G+ Sbjct: 301 GRYLGGLGGGAGNVHRRLRQFEALRQASVCLGLARRLVAAKLEGQLRFLLRASRGDSESR 360 Query: 129 -----------------PARRSVEQLRGIEGSRVRATYALLAK------QYGVTWNGRRY 165 S+E L G+EG+ + L + + GR Sbjct: 361 QVLASAVRDLRALLPKCEEAPSLEVLLGLEGAGAARYFGALPYLQGEDVDTRLRFEGRNR 420 Query: 166 DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIADII 223 P D N + ++ EAAI A G A GF H G D+ ++ Sbjct: 421 RPPR----DRFNAVLGFLFGLVHREVEAAIRAVGLDVAFGFYHQPRGTAGPLGLDVMELF 476 Query: 224 KFDTVVPKAFEIARRNPGEPDREV 247 + R + D + Sbjct: 477 RVPLADMPLVASVNRRAWDADADF 500 >UniRef50_Q03KT5 CRISPR-associated protein, Cas1 family n=5 Tax=Streptococcus RepID=Q03KT5_STRTD Length = 334 Score = 58.1 bits (139), Expect = 4e-07, Method: 
Composition-based stats. Identities = 40/251 (15%), Positives = 83/251 (33%), Gaps = 44/251 (17%) Query: 15 VSMIFLQYGQ--IDVIDGAFVLI-DKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 +S ++ Q + + + ++ D I + + V ++L +++ ++ ++ Sbjct: 1 MSDLYSQRSNYYLSLSEQRIIIKNDNKEIVKEVSISLVDNVLLFGNAQLTTQLIKALSKN 60 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSD--KLLYQAKLALDEDLRLKVVRKM---------- 119 + + G + S + + K QAK +ED RL+V R + Sbjct: 61 KVNVYYFSNVG--QFISSIETHRQDEFQKQELQAKAYFEEDFRLEVARSIATTKVRHQIA 118 Query: 120 ---------------FELRFGE---PAPARRSVEQLRGIEGSRVRAT--YALLAKQYGVT 159 + RF + S+ ++ G EG ++ Y L Sbjct: 119 LLREFDTDGLLDTSDYS-RFEDSVNDIQKAYSITEIMGYEGRLAKSYFYYLNLLVPDDFH 177 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVY 217 +NGR D N ++ S LY I G + G +H + Sbjct: 178 FNGRSRR----TAEDCFNSALNFGYSILYSCLMGLIKKNGLSLGFGVIHKHHQHHATLAS 233 Query: 218 DIADIIKFDTV 228 D+ + + V Sbjct: 234 DLMEEWRPIIV 244 >UniRef50_Q5X8T5 Putative uncharacterized protein n=1 Tax=Legionella pneumophila str. Paris RepID=Q5X8T5_LEGPA Length = 330 Score = 57.7 bits (138), Expect = 5e-07, Method: Composition-based stats. Identities = 29/113 (25%), Positives = 43/113 (38%), Gaps = 8/113 (7%) Query: 136 QLRGIEGSRVRATYALLAKQYGV--TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEA 193 L G EGS R + + + V + GRR D N ++ A S L Sbjct: 164 TLLGFEGSAARIYWETVKEVNLVCSDFEGRRPRVNK----DITNSMLNYAYSILSSWVWQ 219 Query: 194 AILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 AI AG G +HT KP S V D+ + + F++ R + + Sbjct: 220 AITNAGLELYAGILHTSKPGKPSLVLDLMEEFRPWCADRIIFKLHARAQKQNE 272 >UniRef50_A4FJX8 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Saccharopolyspora erythraea NRRL 2338 RepID=A4FJX8_SACEN Length = 549 Score = 57.3 bits (137), Expect = 7e-07, Method: Composition-based stats. 
Identities = 41/226 (18%), Positives = 65/226 (28%), Gaps = 45/226 (19%) Query: 55 EPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLK 114 P +V+ AV A+ G+ +VW G G + Q + L Sbjct: 261 GP-VQVTSQAVHALAEQGSPVVWTSTTGRLKSVDIPTVGKHVELRRRQFT--ATPNTALD 317 Query: 115 VVRKMFE----------LRFGEPAPARR---------------SVEQLRGIEGSRVRATY 149 R++ R + L GIEG+ R + Sbjct: 318 FARRIVGGKIRNARTLLRRNPHVEEPDLLNRLDADAVRAERAPTRSTLLGIEGAAARTYF 377 Query: 150 ALLAKQYGVTWN---------GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGY 200 A L + + GR P D ++ +S L A A G Sbjct: 378 AGLVETFRTDHRLPGPAFDTMGRTRRPPR----DAVSCLLSFLYCLLIKDITTACYALGL 433 Query: 201 APAIGFVHT---GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 P GF H G+P + D+A+ + A + +P Sbjct: 434 DPYFGFYHQPRHGRP-ALTLDLAEEFRPLIADSTALTLINNLQADP 478 >UniRef50_B3EG05 CRISPR-associated protein Cas1 n=11 Tax=Bacteria RepID=B3EG05_CHLL2 Length = 384 Score = 56.9 bits (136), Expect = 8e-07, Method: Composition-based stats. Identities = 22/120 (18%), Positives = 45/120 (37%), Gaps = 11/120 (9%) Query: 130 ARRSVEQLRGIEGSRVRATYALLAKQYG-----VTWNGRRYDPKDWEKGDTINQCISAAT 184 A + + L G+EG+ + Y + + +++ R P D + +N +S Sbjct: 197 AAQDIPSLMGVEGNIRKVYYQVWQQLLRSADPAFSFSERVKRPPD----NAVNALVSFGN 252 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGE 242 S +Y I P + F+H S D+A++ K + F++ + Sbjct: 253 SLMYSACLTEIYRTQLNPTVSFLHEPSERRFSLALDMAEVFKPMFIDRLIFKLVNTRAIQ 312 >UniRef50_B8GLF6 CRISPR-associated Cas1/Cas4 family protein n=1 Tax=Thioalkalivibrio sp. HL-EbGR7 RepID=B8GLF6_THISH Length = 570 Score = 56.9 bits (136), Expect = 9e-07, Method: Composition-based stats. 
Identities = 24/126 (19%), Positives = 44/126 (34%), Gaps = 17/126 (13%) Query: 130 ARRSVEQLRGIEGSRVRATYALLAK---------QYGVTWNGRRYDPKDWEKGDTINQCI 180 ++++L GIEG+ + A+ + + R P D +N + Sbjct: 379 RASNLQELLGIEGAAASRYFGAFARLLKHSDAGPELTFDFTTRNRRPP----TDPVNALL 434 Query: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIKFDTVVPKAFEIAR 237 S A + L A++ A G P GF H G+P + D+ + + + Sbjct: 435 SYAYALLTRSWTASLSAVGLDPYRGFYHQPRYGRP-ALALDMMEPFRPLIADSSVIQAIN 493 Query: 238 RNPGEP 243 P Sbjct: 494 NGEVRP 499 >UniRef50_Q8F874 Putative uncharacterized protein n=1 Tax=Leptospira interrogans RepID=Q8F874_LEPIN Length = 254 Score = 56.6 bits (135), Expect = 1e-06, Method: Composition-based stats. Identities = 27/148 (18%), Positives = 52/148 (35%), Gaps = 8/148 (5%) Query: 128 APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI---NQCISAAT 184 S+E +RG EG+ + +++ Y + + + N +S Sbjct: 65 LEKAESIESIRGYEGASAKTYFSVF--DYCIIQQKEDFQFHKRTRRPPRSRTNALLSFLY 122 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGE 242 S L A A G P IGF+H +P S D+ + + + F + R + Sbjct: 123 SLLTNDCIAVCQAVGLDPYIGFLHDERPGRPSLALDMMEEFRP-FIDRLVFTLINRKQIQ 181 Query: 243 PDREVRLACRDIFRSSKTLAKLIPLIED 270 + F + + +LI ++ Sbjct: 182 VSDFLEKPGSVFFINDDSRKELIKSYQE 209 >UniRef50_A1RZT8 CRISPR-associated protein Cas1 n=1 Tax=Thermofilum pendens Hrk 5 RepID=A1RZT8_THEPD Length = 327 Score = 56.6 bits (135), Expect = 1e-06, Method: Composition-based stats. Identities = 25/91 (27%), Positives = 43/91 (47%), Gaps = 9/91 (9%) Query: 150 ALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 L ++YG + GR+ D +N+ ISA + LY ++ A++AAG P GF+H Sbjct: 170 RELRERYG--FAGRK-----PGHPDPLNKAISAMYAVLYTLSTKALVAAGLDPTYGFLHR 222 Query: 210 GK-PLSFVYDIADIIKFDTVVPKAFEIARRN 239 + + +D A+ K V A ++ Sbjct: 223 TQYSVPLAFDYAEAFKP-LAVEAALDLVNEE 252 >UniRef50_A7I668 CRISPR-associated protein Cas1 n=1 Tax=Candidatus Methanoregula boonei 6A8 RepID=A7I668_METB6 Length = 310 Score = 56.6 bits (135), Expect = 1e-06, Method: Composition-based stats. 
Identities = 24/140 (17%), Positives = 54/140 (38%), Gaps = 5/140 (3%) Query: 133 SVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTE 192 ++++R + Y +L++ ++ RR + + D IN +S + L+G Sbjct: 150 KLDEIRRLSDLTANMYYEILSRDIPREFDFRRRTVR--PQCDPINAMLSFGYAMLFGNCC 207 Query: 193 AAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARR--NPGEPDREVRLA 250 ++ A P +G ++ G + V D+ K + + IAR N G+ + Sbjct: 208 VPVIGARLDPDLGILYEGS-GALVQDLMASFKAQMIDGVIWSIARDSLNVGDFEITSNRC 266 Query: 251 CRDIFRSSKTLAKLIPLIED 270 ++ I++ Sbjct: 267 ILSDNLIQNLMSSFRKSIDN 286 >UniRef50_D0LSW9 CRISPR-associated protein Cas1 n=1 Tax=Haliangium ochraceum DSM 14365 RepID=D0LSW9_HALO1 Length = 594 Score = 56.2 bits (134), Expect = 1e-06, Method: Composition-based stats. Identities = 46/260 (17%), Positives = 85/260 (32%), Gaps = 38/260 (14%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTH-IPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVG 79 ++G + A ++I + G + + V+ I L ++ A++ A G + Sbjct: 273 EHGAVVSKRAAELVIKRKGSELERVRIKDVSRINLHGSAHITLPALQTALGNGIPVGLFT 332 Query: 80 EAGVRVYASGQPGGARSDKLLYQAKLAL--DEDLRLKVVRKMFELRFGEPAP-------- 129 G Y G + LL QA+ A DE L++ +++ + Sbjct: 333 YGG--WYYGRAQGHDHKNVLLRQAQFASAQDEGRCLRIAQRLVHAKIKNSRVMLRRNSRA 390 Query: 130 -----------------ARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNGRRYDPKDW 170 S L GIEGS R + + V ++ + + Sbjct: 391 LDRRILDDLSGHARRARQADSQATLLGIEGSAARLYFQNFSGMLRQDVPFSFDSRNRRPP 450 Query: 171 EKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIKFDT 227 D +N +S + + L A + G+ P GF H G+P S D+ + + Sbjct: 451 R--DPVNALLSFSYALLTAEWTATLSTVGFDPYQGFYHQPRYGRP-SLALDLMEEFRPLI 507 Query: 228 VVPKAFEIARRNPGEPDREV 247 + D V Sbjct: 508 ADSVVIGAINNGVLDEDDFV 527 >UniRef50_D0W646 CRISPR-associated protein Cas1 n=1 Tax=Neisseria cinerea ATCC 14685 RepID=D0W646_NEICI Length = 325 Score = 55.8 bits (133), Expect = 2e-06, Method: Composition-based stats. 
Identities = 35/226 (15%), Positives = 73/226 (32%), Gaps = 39/226 (17%) Query: 18 IFLQYGQI-DVIDGAFVLIDKTGIRTH-IPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +++ + DG ++ + G RT IP+ + C+ + +S + + + Sbjct: 4 LYIDRKNLTLRADGDSLVCYENGERTATIPLKVLQCVCIRGDLTLSAKVLGKLGEADIGV 63 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRK------------MFEL- 122 + + R + + Q + D+ L R + ++ Sbjct: 64 LVLNGRLKRPALMLPNLKLDGSRRVAQYAFSQDKAACLAAARNTVSAKLSAQQQHLQQMM 123 Query: 123 ------------------RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVT--WNG 162 R + P + +LRGIEG+ + A T + G Sbjct: 124 PSDGTAADCLNKHIKAISRLADTVPDCNGIARLRGIEGAAAAQYFGAWAAVLPETLHFTG 183 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH 208 R P D +N +S + ++ + +G P IG+ H Sbjct: 184 RNRRPPR----DPVNAALSLTYTLMHFEIVKHLHLSGLDPFIGYYH 225 >UniRef50_B7KM77 CRISPR-associated protein Cas1 n=1 Tax=Cyanothece sp. PCC 7424 RepID=B7KM77_CYAP7 Length = 334 Score = 55.4 bits (132), Expect = 2e-06, Method: Composition-based stats. Identities = 25/163 (15%), Positives = 56/163 (34%), Gaps = 14/163 (8%) Query: 128 APARRSVEQLRGIEGSRVRATYALLAK---QYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + +++QLRG EG + + +++ R P D +N +S Sbjct: 145 LESVDNLDQLRGYEGIAAARYFPAFGQLITNAAFSFSLRNRQPP----TDPVNSLLSFGY 200 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEIARRNPGE 242 + L+ + I++ G +P G H G+ +D+ + + V + Sbjct: 201 TLLFNNVLSLIISEGLSPYFGNFHYGERDKPYLAFDLMEEFRAIIVDGMVLRVINNGLLT 260 Query: 243 PDREVRLACRDIFRSSKTLAKLI-----PLIEDVLAAGEIQPP 280 +A + ++ I +++ +IQ P Sbjct: 261 LKDFEPVASNGGVYLTDKGRRIFLKEFESRINKLISHPDIQSP 303 >UniRef50_B1GZM4 CRISPR-associated protein Cas1 n=1 Tax=uncultured Termite group 1 bacterium phylotype Rs-D17 RepID=B1GZM4_UNCTG Length = 298 Score = 55.4 bits (132), Expect = 3e-06, Method: Composition-based stats. 
Identities = 32/231 (13%), Positives = 78/231 (33%), Gaps = 10/231 (4%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVGTLLVWVGEA 81 + V + + IP+ ++ I+LE ++ A + +Q ++ + Sbjct: 12 CSLSVKNSQLYCRFQDSTVHDIPIEDISVIVLESNRINLTSALISECSQSNIVIFSCDSS 71 Query: 82 GVRV--YASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRG 139 + Y + Q K R+ ++ + S G Sbjct: 72 HIPCGIYVPFNQHSRFTQTANSQVKWDTAFKNRIWQKIVKQKICNQAEIIKKYSFANYVG 131 Query: 140 IEGS--RVRATYAL--LAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAI 195 ++G+ RV++ A + W + + + D N ++ + + GV ++ Sbjct: 132 LKGTCDRVQSGDKTNCEAFAAKIYWESIFENFQRNKNSDIRNSALNYGYAIVRGVVARSL 191 Query: 196 LAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEPD 244 ++G+ G H+ +F DI + + V A +I + + + Sbjct: 192 ASSGFITCFGVHHSNDLNAFNLADDIIEPFRP-FVDDIAIDIFKNSETSKE 241 >UniRef50_UPI0000F51762 hypothetical protein Faci_00015 n=1 Tax=Ferroplasma acidarmanus fer1 RepID=UPI0000F51762 Length = 328 Score = 55.4 bits (132), Expect = 3e-06, Method: Composition-based stats. Identities = 41/234 (17%), Positives = 75/234 (32%), Gaps = 35/234 (14%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + V G ++ + G + IPV +V + +S V + +++G + + G Sbjct: 14 VYVDGGTIIIENSIG-KNAIPVENVRSVYAHKPVSISSGVVSIISKLGIPIHFFNWYG-- 70 Query: 85 VYAS---GQPGGARSDKLLYQAKLALDEDLRLKVVRKM-------FELRFGEPAPAR--- 131 Y + + D ++ QA+ LD R+ + + F E Sbjct: 71 NYEATLWPKSKDISGDVIIKQAQKYLDIHERINIAKSFVSGALHNFNRILSEYDTETVKK 130 Query: 132 ---------------RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 + ++ GIEG + + + + R G+ Sbjct: 131 SREDIKSNIGNLTNAMDITEIMGIEGRSHNSYFKAMDSV--IPEKFRINKRIRRPPGNMG 188 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTV 228 N IS S +Y I P I ++H S DIA+I K Sbjct: 189 NALISFGNSLVYASVLTEIYFTHLNPTISYLHEPSERRFSLSLDIAEIFKPIIS 242 >UniRef50_D0MJ58 CRISPR-associated protein Cas1 n=1 Tax=Rhodothermus marinus DSM 4252 RepID=D0MJ58_RHOM4 Length = 360 Score = 54.6 bits (130), Expect = 4e-06, Method: Composition-based stats. 
Identities = 50/281 (17%), Positives = 85/281 (30%), Gaps = 52/281 (18%) Query: 10 PLKDRVSMIFLQYG---QI--DVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 L+ R + +F + +I D + TG T PV V + ++ Sbjct: 12 RLRRRQNTLFFEKAAGERIPDDQDETGVPSGTPTGESTPFPVEQVESLYFFGEVDLNSKL 71 Query: 65 VRLAAQVGTLLVWVGEAGVRVYAS---GQPGGARSDKLLYQAKLALDEDLRLKVVR---- 117 + A+ + G Y + + Q + R + R Sbjct: 72 LTFLARHDIPAHFYDYYG--NYTGTYIPRDYLHSGRLRIEQVLHYVRPKRRRYLARAIVE 129 Query: 118 ----------KMFELRFGEPA-----PARRSVEQ-------------LRGIEGSRVRATY 149 + + R A ++EQ L GIEG R R Y Sbjct: 130 AATYNLLRVLRYYVNRLEGERREAVAEAIATIEQERTQLRSAEKIPELMGIEG-RSREAY 188 Query: 150 -----ALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAI 204 ++LA G + + + + + IN IS S Y T I P I Sbjct: 189 YSAWPSILADGPGEAFVFEKRERRPP--SNEINALISFGNSLCYTTTIRQIHRTALDPTI 246 Query: 205 GFVHT--GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 ++H + S D+++I K V F + + P Sbjct: 247 SYLHEPGARRFSLALDLSEIFKPILVDRAIFRLVKTGEITP 287 >UniRef50_C3WD45 CRISPR-associated protein cas1 n=1 Tax=Fusobacterium mortiferum ATCC 9817 RepID=C3WD45_FUSMR Length = 327 Score = 54.6 bits (130), Expect = 4e-06, Method: Composition-based stats. Identities = 18/92 (19%), Positives = 34/92 (36%), Gaps = 4/92 (4%) Query: 135 EQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAA 194 ++ G EG Y + K W+ + + N ++ A LY E Sbjct: 157 NRVMGYEGRSSILYYEEIKKFLPENWSFT--KRETQGAKEPYNIVLNYAFGILYFKLERY 214 Query: 195 ILAAGYAPAIGFVHT--GKPLSFVYDIADIIK 224 + AG +G +H+ K S ++D + + Sbjct: 215 LTLAGLDIQLGIIHSNNNKSNSLIFDFIEPFR 246 >UniRef50_A8REI1 Putative uncharacterized protein n=2 Tax=unclassified Erysipelotrichaceae RepID=A8REI1_9FIRM Length = 300 Score = 54.6 bits (130), Expect = 4e-06, Method: Composition-based stats. 
Identities = 31/241 (12%), Positives = 84/241 (34%), Gaps = 53/241 (21%) Query: 17 MIFLQ---YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVG 72 +++++ + Q+ + + + ++ + P+ + ++++ + ++ V + Sbjct: 5 VLYIENQYHLQLYLDN---LKVETSQGDIQFPISDIQILVIDHYRSTLTVPLVNKLTENN 61 Query: 73 TLLVWVGEAGVR-VYASGQPGG-ARSDKLLYQAKLALDEDLRL----------------- 113 ++ G + Y G A+S ++ Q +E ++ Sbjct: 62 VCVIICGIDHLPKSYILPMNGHFAQSGNIMKQIT-WSNEIKQILHQQIVKAKIFNQIEIL 120 Query: 114 -------KVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYD 166 +V++K++E + + EG + + + +G + Sbjct: 121 KTNQCKYEVIKKLYEF-YDTVDLGDAT-----NREGLAAKMYFREM---FGNDFIRFE-- 169 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIK 224 D IN ++ S + + I+ GY P +G H GK F DI ++ + Sbjct: 170 ------DDVINAGLNYGYSIFRSLISSIIVGKGYLPNLGIFHKGKTNMFNLSDDIIEVFR 223 Query: 225 F 225 Sbjct: 224 P 224 >UniRef50_B3W9S5 CRISPR-associated protein n=4 Tax=Lactobacillus RepID=B3W9S5_LACCB Length = 301 Score = 54.6 bits (130), Expect = 4e-06, Method: Composition-based stats. 
Identities = 36/238 (15%), Positives = 82/238 (34%), Gaps = 45/238 (18%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTR--VSHAAVRLAAQVGT 73 S+I Q+ ++ + V+ G ++P+ + ++ TR +S A+ A+ Sbjct: 5 SLIVTQHCKVTTKNRTLVVQT-DGEVNNVPIEDINQVVFTT-TRALLSADAITTLAEANA 62 Query: 74 LLVWVGEAGVR------VYASGQPGGARSDKLLYQA----KLALDEDLRLKVVRKMFELR 123 +++ G G +Y+ ++D + Q L + ++ + + + Sbjct: 63 KVIFSGRDGQPVTETTNLYS----DRRKADLVRLQVNWPKSLVENLWTKIVAAKVSNQAQ 118 Query: 124 ------FGEP----APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKG 173 F + E + R + L+ +G D+ + Sbjct: 119 VTKLCGFDNQSLLDDLDTLEINDRSNREATAARKYFKLI---FG----------DDFSRS 165 Query: 174 D--TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDT 227 D N ++ S L T AI++AG+ IG H+ + D+ + + Sbjct: 166 DICATNAALNYGYSILLSTTNRAIVSAGHITEIGMHHSSVANQYNLGSDLMEPFRPAI 223 >UniRef50_Q2FPW6 CRISPR-associated protein Cas1 n=1 Tax=Methanospirillum hungatei JF-1 RepID=Q2FPW6_METHJ Length = 303 Score = 54.6 bits (130), Expect = 5e-06, Method: Composition-based stats. Identities = 43/269 (15%), Positives = 83/269 (30%), Gaps = 35/269 (13%) Query: 26 DVIDGAFVL-IDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + + I G T IP+ ++ +L G + + + + G + + G Sbjct: 7 HIKSNRTEITIQHKGKITDIPIKDLSHFLLIGGHTIQTSTITSLVKEGVFISFCESDGEP 66 Query: 85 V-YASGQPGGARSDKLLYQ--------AKLALDEDLRLKV-VRKMFELRFG--------- 125 V Y S + Q A +E ++ ++ + + G Sbjct: 67 VGYISPYDYSLFKEIQNLQKTAAPYSYALACANESIKSRILAIEKYAEEIGPEILFSGEL 126 Query: 126 -------EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVT--WNGRRYDPKDWEKGDTI 176 + +E+LR IE Y +L + T + R P D + Sbjct: 127 DILTGYAKELENMVLIEELRRIEQLVRDMYYEILGRLISPTYLFKRRTSRP----YLDPV 182 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIA 236 N S L A++ P G+++ G + V D+ + K + A Sbjct: 183 NAIFSFGYGMLSSACTRAVIGGHLDPGHGYLNRGNQ-ALVQDLMNCWKPKMIDNHAIGFL 241 Query: 237 RRNPGEPDREVRLACRDIFRSSKTLAKLI 265 R + R R + + +LI Sbjct: 242 RSGRLHQNGYERTKDR-CILHDEVIEELI 269 >UniRef50_A1WH94 CRISPR-associated protein, Cas1 family n=5 Tax=Proteobacteria 
RepID=A1WH94_VEREI Length = 309 Score = 54.2 bits (129), Expect = 5e-06, Method: Composition-based stats. Identities = 44/262 (16%), Positives = 85/262 (32%), Gaps = 39/262 (14%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTH---IPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLLV 76 + V G V+ D G R +P+ +A ++ G ++ + A+ G V Sbjct: 11 DRRHLFVNRGFMVIKDTEGERKELGQVPLDDIAAVIANAHGLTYTNNLLVALAERGAPFV 70 Query: 77 WVGEAGVRVYASGQPGGAR-----SDKLLYQAKLALDEDLRLKVVR-----KMFELRFGE 126 G A G + ++ Q L RL + Sbjct: 71 LCGPN---HNAVGMLLPLDGHHVQAKRIEAQIAAGLPMHKRLWAAVVKSKLEQQAAALEA 127 Query: 127 PAPARRSVEQL---------RGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTIN 177 A ++ L IEG R + LL +G + + GD +N Sbjct: 128 VAAPTAPLQALVAKVRSGDPENIEGQGARRYWGLL---FGAEFRR-------DQSGDGLN 177 Query: 178 QCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEI 235 ++ + + A++AAG P+IG H+ + V D+ + + V K +++ Sbjct: 178 ALLNYGYTIVRSACARAVVAAGLHPSIGLHHSNDANAMRLVDDLMEPFRP-IVDLKVWQL 236 Query: 236 ARRNPGEPDREVRLACRDIFRS 257 + + + A + Sbjct: 237 HKAGESHVTPDTKRALVRVLYD 258 >UniRef50_A3DLB7 CRISPR-associated protein, Cas1 family n=1 Tax=Staphylothermus marinus F1 RepID=A3DLB7_STAMF Length = 341 Score = 54.2 bits (129), Expect = 6e-06, Method: Composition-based stats. 
Identities = 46/273 (16%), Positives = 84/273 (30%), Gaps = 50/273 (18%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + G V++ G R PV V +++ +S ++L A G ++ G Sbjct: 19 LRKKHGRIVVVS-RGGREEFPVRRVREVIISGKAGISTELLKLLADSGIDVLVTSYTGRP 77 Query: 85 --VYASGQPGGARSDKLLYQAKLALDEDLRLKVV-------------RKMF---ELRFGE 126 ++ + G + Q + D K + + Sbjct: 78 VALFVHARSG-GSVRNRIEQYRSLEDGRACKAAAMIISGKLSNQVSNLKYYSKPRKNISQ 136 Query: 127 P--------APARRSVEQLRGIE----GSRVRATYALLAKQYGVTWNG-----------R 163 ++ E+LR IE G ++ A+ W+G Sbjct: 137 ESKILYEKAEEIKKLREKLRDIETSDLGKCRENIMSIEAQAANTYWDGMRIVLGKYGFKE 196 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKP--LSFVYDIAD 221 R ++ D +N C++ + L G +L P GF+H +P LS VYD+ + Sbjct: 197 RVKRGRGKEVDPVNLCLNIIYNKLAGTVWKYVLRFSLDPFQGFLHARRPGKLSLVYDLME 256 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACRDI 254 + P A R D+ Sbjct: 257 PFR-----PIADRFIARFLYRLDQGFLRRASGA 284 >UniRef50_A7BYC5 Protein containing DUF48 n=1 Tax=Beggiatoa sp. PS RepID=A7BYC5_9GAMM Length = 322 Score = 53.9 bits (128), Expect = 6e-06, Method: Composition-based stats. 
Identities = 45/262 (17%), Positives = 84/262 (32%), Gaps = 44/262 (16%) Query: 18 IFLQY-GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 +F+ + + + G P+ + ++L + ++ + L Q G L Sbjct: 4 LFISRDATLKRRENTLA-VTVGGKTKPFPIEKIRHLVLLGESSLNTKLLTLCGQNGVRLS 62 Query: 77 WVGEAGVRVYASGQPG---GARSDKLLYQAKLALDEDLRLKVVR--------------KM 119 G + L QAKL LD++ R+ + R Sbjct: 63 IFDYYG--YFKGAFEPIEQNGSGRVKLAQAKLILDQEQRMAIAREIVRGAAHNMRANLAY 120 Query: 120 FELRFGEPAPARRSVE---------------QLRGIEGSRVRATYALLAK-QYGVTWNGR 163 ++ R G A ++ + E +L G EG + +A A + + R Sbjct: 121 YQYR-GNKALSKSTQEITKLMDRLHFAKDSDELMGFEGQITQTYFAAWALIDQRLDFLPR 179 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIAD 221 P + + IN IS Y VT + F+HT S D+++ Sbjct: 180 VRRPPN----NPINCLISFINQLTYTVTRHEAFKTHLEETLSFLHTPSTGRSSLSLDLSE 235 Query: 222 IIKFDTVVPKAFEIARRNPGEP 243 K ++ R+N + Sbjct: 236 PFKPVLSHGLIIKMVRKNMVDD 257 >UniRef50_B5YJS2 Crispr-associated protein Cas1 n=1 Tax=Thermodesulfovibrio yellowstonii DSM 11347 RepID=B5YJS2_THEYD Length = 318 Score = 53.9 bits (128), Expect = 7e-06, Method: Composition-based stats. 
Identities = 41/253 (16%), Positives = 84/253 (33%), Gaps = 47/253 (18%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S +F+ +I V + K +P+ + +++ ++ + + G Sbjct: 1 MSTVFIDRKDIEIRVDGNSISFYAKGKKDGSLPLSPLKRVVIVGNVKIETSVLYKLVNHG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLY--------------------QAKLALDED-- 110 ++++ G Y+ G ++ LL + K+ D Sbjct: 61 ITVLFL--TGKLKYSGILNGPLHNNGLLRVKQYQKSLSGFSLKFAKELIKRKIVSQRDFL 118 Query: 111 ---LRLKVVRKMFELR--------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--G 157 +K M R S++ LRGIEG+ + +K + Sbjct: 119 SEIREIKKALAMQADRAIEILNKAISNIEVTPISIDSLRGIEGAASSIYFITYSKIFPNS 178 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLS 214 + + R P D +N +S + L+ I G P IGF H G+ S Sbjct: 179 LKFVRRIKRPPK----DPVNAMLSLCYTLLHYEIVREIQLIGLDPTIGFYHQFEYGRE-S 233 Query: 215 FVYDIADIIKFDT 227 D+ ++ + + Sbjct: 234 LACDLVELFRVNV 246 >UniRef50_C0A724 CRISPR-associated Cas1/Cas4 family protein n=1 Tax=Opitutaceae bacterium TAV2 RepID=C0A724_9BACT Length = 691 Score = 53.9 bits (128), Expect = 8e-06, Method: Composition-based stats. 
Identities = 47/303 (15%), Positives = 86/303 (28%), Gaps = 85/303 (28%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGV- 83 + + +K + I + ++ + + +S A + A+ + + AG Sbjct: 329 LLKKSELIQIREKGELVNEIRIKDLSHVAIFGSATISTALLNELAERDIAVSYFSSAGTL 388 Query: 84 RVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE----------LRFGEPAPARR- 132 R Y G + + Q + A L++ R + +R PA Sbjct: 389 RAYTRGPSLKNVFTR-IAQFRAADTPATALRIARLFVQGKIRNQRTLIMRNHAMPPASTL 447 Query: 133 --------------SVEQLRGIEGSRVRATYALLA------------------------- 153 S+E+L GIEGS A + + Sbjct: 448 GRLQHAITAAANTESIEELLGIEGSAALAYFQEFSGMIKTTGDDILDAIAEGREPEPGSL 507 Query: 154 -------------------------KQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLY 188 + + + R P D +N +S A S L Sbjct: 508 SDTTQAPDITTGKKRRSGKRDTPGQEFFSFDFTRRNRRPPR----DAVNALLSLAYSILA 563 Query: 189 GVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDR 245 +A A G+ P +GF H G+P + D+ + + + P Sbjct: 564 KDCTSAAHAVGFDPYVGFYHQPRFGRP-ALALDLMEEFRPLVADSVVLTLINTRMISPTD 622 Query: 246 EVR 248 VR Sbjct: 623 FVR 625 >UniRef50_C6MJ62 CRISPR-associated protein Cas1 n=5 Tax=Nitrosomonas sp. AL212 RepID=C6MJ62_9PROT Length = 430 Score = 53.5 bits (127), Expect = 1e-05, Method: Composition-based stats. Identities = 22/110 (20%), Positives = 43/110 (39%), Gaps = 15/110 (13%) Query: 133 SVEQLRGIEGSRVRATYALLA--------KQYGVTWNGRRYDP-----KDWEKGDTINQC 179 ++E++R IE ++ K + W G K ++ +N Sbjct: 251 TLEEMRLIEARAAIIYWSTWNNCVIKWKEKDVPIEWRGFSQRASGISGKGYKATHPVNAI 310 Query: 180 ISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDT 227 ++ A + L G E A+ G A+G +H + S VYD+ + ++ Sbjct: 311 LNYAYAILAGQVERALQIVGLDVAVGSLHADQDGRASLVYDLMEPLRPVI 360 >UniRef50_Q7MTH7 CRISPR-associated protein Cas1 n=3 Tax=Porphyromonas gingivalis RepID=Q7MTH7_PORGI Length = 1031 Score = 53.1 bits (126), Expect = 1e-05, Method: Composition-based stats. 
Identities = 37/284 (13%), Positives = 79/284 (27%), Gaps = 61/284 (21%) Query: 45 PVGSVACI-MLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQA 103 P + I ++ G +S + + +++ G + + Q Sbjct: 729 PAAQIEQISIISDGVSLSSNVTKYCRKKNIRVIFYNATGQAYASLNGMNTILPSVMEAQM 788 Query: 104 KL----------------ALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEG----- 142 +L ++ L+ K + P ++ +L+ +EG Sbjct: 789 RLSEEKKREFILTLIKNKVRNQGKLLRYYHKYYRHDKELKEPLSNAIAELKQLEGIPIAE 848 Query: 143 ---------------SRVRATY----ALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAA 183 +R Y ALL + G + GR + +NQ ++ Sbjct: 849 GSSLADFRQHAMLHEARCAQVYWRAFALLVHRSGHEFEGREHK----GAEGLVNQMLNYG 904 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPKAFEIARRNP- 240 + L I+ P IG +H+ + +D+ + + V + + Sbjct: 905 YAILRSYVMKTIVLWQLNPNIGILHSTQDNKPALCFDLMEQYRAFVVDRSILALLAKGED 964 Query: 241 ------GEPDREVRLACRDIFRSS-------KTLAKLIPLIEDV 271 G D R ++ KL I + Sbjct: 965 VGQNSKGLLDMPTRSRIISKINERWFATEYYRSGEKLFSDIMKL 1008 >UniRef50_B5ZLL1 CRISPR-associated protein Cas1 n=2 Tax=Gluconacetobacter diazotrophicus PAl 5 RepID=B5ZLL1_GLUDA Length = 297 Score = 53.1 bits (126), Expect = 1e-05, Method: Composition-based stats. 
Identities = 38/253 (15%), Positives = 81/253 (32%), Gaps = 29/253 (11%) Query: 30 GAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLLVWVGEAGVRVYAS 88 +++ + G + V +AC++L+ ++ + + A+ G ++ R + + Sbjct: 18 NRQLVVAQDGGEVSLAVEDIACLILDTRQVSITGSLLSALAENGVAMIV---PDARHHPA 74 Query: 89 GQPGGARSDKLLYQA-------------KLALDEDLRLKVVRKMFELRFGEPA--PARRS 133 G +L L V + + + P ++ Sbjct: 75 GILLPFHQHHAQAHIAHAQISISQPLKKRLW----QTLVVAKIRNQAALLDQLGRPQGQT 130 Query: 134 VEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEA 193 + + G S A + W D + D N ++ + + Sbjct: 131 IAAMAGRVASGDPGNVEAQAARAY--WASLFSDFTRANENDRRNALLNYGYAIMRAAIAR 188 Query: 194 AILAAGYAPAIGFVHTGKPLSF--VYDIADIIKFDTVVPKAFEIARRNPGEPDR-EVRLA 250 A +A G PA G H K +F V D+ + + V A + A + G+ E R Sbjct: 189 ACVALGLLPAFGVHHASKTNAFNLVDDLIEPFRP-FVDRMAHDRALEHVGDTLSIEDRRQ 247 Query: 251 CRDIFRSSKTLAK 263 I + + + Sbjct: 248 MSTILNDNAAIGR 260 >UniRef50_A1BI46 CRISPR-associated protein Cas1 n=2 Tax=Chlorobiaceae RepID=A1BI46_CHLPD Length = 347 Score = 52.7 bits (125), Expect = 1e-05, Method: Composition-based stats. Identities = 38/206 (18%), Positives = 71/206 (34%), Gaps = 32/206 (15%) Query: 91 PGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYA 150 L A+ + E LR ++ +L E S++ +RG EGS + Sbjct: 122 KLHNSGVLLRRHAESSGSEALR-HAATQLRQLE--EHVDRADSIDAVRGYEGSGAATYFG 178 Query: 151 LLAKQY---GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFV 207 + + G + R P D +N +S S L+ + P +GF+ Sbjct: 179 VFEDFFDTGGFIFRERVKRPP----TDPVNAMLSFGYSLLFNNIFSMARLHRLHPYVGFL 234 Query: 208 HTGKPL--SFVYDIADIIKF------------DTVVPKAFEIARRNPGEPDREVRLACRD 253 H KP + V D+ + + + P+ F +AR + G+P + Sbjct: 235 HADKPAHPALVSDLIEEFRTLVDGLVIALINKRLISPEEFTVARHDDGKP--------KG 286 Query: 254 IFRSSKTLAKLIPLIEDVLAAGEIQP 279 + S + E+++ P Sbjct: 287 CYLSDGARKTFLREFENLMHRTTTHP 312 >UniRef50_UPI0001BCCAFD hypothetical protein SnoxA4_00467 n=1 Tax=Selenomonas noxia ATCC 43541 RepID=UPI0001BCCAFD Length = 281 Score = 52.7 bits (125), Expect = 2e-05, Method: Composition-based stats. 
Identities = 20/123 (16%), Positives = 39/123 (31%), Gaps = 4/123 (3%) Query: 124 FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAA 183 + P + QL G EG+ + + L + + + D N +S Sbjct: 88 LSKHIPRCETNTQLMGYEGAIAKVYFRALGLLVPEAFAFVKRSRRPPM--DPFNTMLSFG 145 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPG 241 + L + G P GF+H K + D+ + + V + + Sbjct: 146 YTLLMYDLYTVVNNEGLHPYFGFLHALKNRHPALASDLMEEWRPVLVDAMVLSLVHHHEM 205 Query: 242 EPD 244 P+ Sbjct: 206 RPE 208 >UniRef50_C8W3G7 CRISPR-associated protein Cas1 n=24 Tax=Bacteria RepID=C8W3G7_DESAS Length = 335 Score = 52.3 bits (124), Expect = 2e-05, Method: Composition-based stats. Identities = 44/265 (16%), Positives = 83/265 (31%), Gaps = 45/265 (16%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA- 81 G + D + + G IP+ + +S ++L A+ G ++ + G Sbjct: 14 GDLYQKDFSIAFRKEDGNFY-IPIKDTRELYCFNDITLSTKLLQLLAKAGIVVHFFGYYE 72 Query: 82 ---GVRVYASGQPGGARSDKLLYQAKLALDEDLRLK------VVRKMFEL-----RFG-- 125 G Y + + QA L++ + + + + R G Sbjct: 73 NYIGT-FY--PKEQLLSGRLTVAQALAYEQNRLQIAGQIIKGIAKNTYFVLYHYYRHGKS 129 Query: 126 ---------EPAPARR-----SVEQLRGIEG---SRVRATYALLAKQYGVTWNGRRYDPK 168 +R +++QL IEG +R ++ + + N R P Sbjct: 130 ELKDFLDWLRKDVSRLVDSVGNIKQLLRIEGEIWARFYQSFRVFLPE-SFAMNKRVKRPP 188 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFD 226 D + IN IS + LY T I I F+H S D++++ K Sbjct: 189 D----NPINALISFGNTLLYTKTITQIFHTHLNQTISFLHEPAERRFSLSLDLSEVFKPV 244 Query: 227 TVVPKAFEIARRNPGEPDREVRLAC 251 V F+ ++ Sbjct: 245 LVCKTIFDCVNNRKIMVEKHFDKKL 269 >UniRef50_C5EZ73 Crispr-protein cas1 n=1 Tax=Helicobacter pullorum MIT 98-5489 RepID=C5EZ73_9HELI Length = 230 Score = 51.9 bits (123), Expect = 2e-05, Method: Composition-based stats. 
Identities = 23/88 (26%), Positives = 39/88 (44%), Gaps = 10/88 (11%) Query: 131 RRSVEQLRGIEG-SRVR--ATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCL 187 ++++QL GIEG V T+ +AK R + D +N + A + L Sbjct: 145 AKNIKQLMGIEGKCAVLYWNTFRHMAK-------FRGFHRIKRNAKDVLNASFNYAYAIL 197 Query: 188 YGVTEAAILAAGYAPAIGFVHTGKPLSF 215 +G +++I+ A P I F+H S Sbjct: 198 HGSIQSSIIKAELNPHISFLHIQNSKSL 225 >UniRef50_A7HP88 CRISPR-associated protein Cas1 n=1 Tax=Parvibaculum lavamentivorans DS-1 RepID=A7HP88_PARL1 Length = 311 Score = 51.9 bits (123), Expect = 2e-05, Method: Composition-based stats. Identities = 39/258 (15%), Positives = 93/258 (36%), Gaps = 28/258 (10%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLLVWVGEAG 82 ++ V + V+ + +P+ + ++++ + A + G ++ G Sbjct: 14 RLSVANKQLVIERPDLPKATLPIEDLGVVIVDDLRATYTQAVFIELLEAGATVMVTGRDH 73 Query: 83 VRVYASGQPGGARSDKLL---YQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRG 139 + +G + + ++A++ E + + + + + + + G Sbjct: 74 LP---AGMMLPLDAHHIQTERHRAQVEASEPTKKRAWQALIRSKIAQQ---GIVLAHFTG 127 Query: 140 IEG------SRVRA-----TYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLY 188 G RVR+ A A++Y G+ + +G +N ++ + + Sbjct: 128 EHGGLLPMARRVRSGDPDNLEAQAAQRYWPRLFGKDFRRDRDLEG--VNALLNYGYAVVR 185 Query: 189 GVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDT---VVPKAFEIARRNPGEP 243 T A +AAG P++G H + F D+ + + V A ++ P Sbjct: 186 AATARATVAAGLIPSLGVFHRNRANPFCLADDLLEPYRPYVDWRVRLLANQMGEEAPSLD 245 Query: 244 DREVRLACRDIFRSSKTL 261 DR+ R A IF + + Sbjct: 246 DRDTRAALLSIFNETVLV 263 >UniRef50_C7V674 Predicted protein n=2 Tax=Enterococcus faecalis RepID=C7V674_ENTFA Length = 304 Score = 51.9 bits (123), Expect = 2e-05, Method: Composition-based stats. 
Identities = 40/244 (16%), Positives = 75/244 (30%), Gaps = 33/244 (13%) Query: 37 KTGIRTHIPVGSV-ACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA--------GVRVY- 86 K G IP+ + + I+ T ++ + AQ V G+ Y Sbjct: 25 KEGNTYVIPLTDIESVILEGDQTVITTRLLAKFAQHHIDTVICDNTFMPVGVFLGIGQYH 84 Query: 87 -----ASGQPGGARSDKLLYQAKLALDE-DLRLKVVRKMFELRFGEPAPARRSVEQLRG- 139 A Q K + ++ + ++ V + + + S L G Sbjct: 85 RSAKRAIWQSNWTEEHKQVAWCEIVTQKIQNQIAVAKYLGTDSERVEVLEKLSEGILPGD 144 Query: 140 ---IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAIL 196 EG + + L YGV + E+ N C++ + + ++ Sbjct: 145 TTNREGHVAKVYFHSL---YGVGFTR--------EEECLPNACMNYGYAVIRAQMARCVV 193 Query: 197 AAGYAPAIGFVHTGKPLSF--VYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDI 254 A G P +G H + SF V D+ + + I ++N RL + Sbjct: 194 ALGLLPMLGIFHKNEYNSFNLVDDLMEPFRPLMDWYIHQTILKKNEKYLTYHSRLTLVEF 253 Query: 255 FRSS 258 Sbjct: 254 LHQK 257 >UniRef50_Q13CC1 CRISPR-associated protein, Cas1 family n=2 Tax=Rhodopseudomonas palustris RepID=Q13CC1_RHOPS Length = 299 Score = 51.5 bits (122), Expect = 3e-05, Method: Composition-based stats. 
Identities = 48/274 (17%), Positives = 93/274 (33%), Gaps = 39/274 (14%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLE-PGTR 59 M W L+ Q ++ + D + + + + +A I+++ P Sbjct: 1 MAWRGLHLT-----------QAARLSLADSQVC-VKQDAGEVRLALEDIAWIVIDTPQAT 48 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGG-----ARSDKLLYQA--------KLA 106 ++ A + + G +LV+ E SG + Q +L Sbjct: 49 LTSALMSACMEAGIVLVFTDERHTP---SGMALPFHRHHRQGGIARLQMDAKDGVKKRLW 105 Query: 107 LDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYD 166 +R K++ + L + A E R +E A A+ Y W D Sbjct: 106 -QAIIRRKILNQAGSLAVLDRNNAETLREIARHVEPGDPENVEARAARFY---WGRLFED 161 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIK 224 + GD N+ ++ + + A++A+G+ PA G H G +F D+ + + Sbjct: 162 FVRDDDGDLRNKMLNYGYAVVRAGVARALVASGFLPAFGLKHDGAANAFNLADDLVEPFR 221 Query: 225 FDTVVPKAFEIAR---RNPGEPDREVRLACRDIF 255 V A++ G+ E R A + Sbjct: 222 P-FVDVLAWKTLGDRVDRKGDLTLEDRRAMAGVL 254 >UniRef50_Q6A5T6 Putative uncharacterized protein n=1 Tax=Propionibacterium acnes RepID=Q6A5T6_PROAC Length = 112 Score = 51.5 bits (122), Expect = 4e-05, Method: Composition-based stats. Identities = 12/40 (30%), Positives = 20/40 (50%) Query: 35 IDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 +G H+P + ++L PGT V+H A+ L + G Sbjct: 5 HQTSGGTVHVPASIIGALLLGPGTNVTHQAMVLLTESGAA 44 Score = 44.6 bits (104), Expect = 0.004, Method: Composition-based stats. Identities = 14/51 (27%), Positives = 18/51 (35%) Query: 235 IARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPED 285 A + + R A RD F SK L K + I +L A D Sbjct: 43 AATGDIDDLAGVTRRAMRDAFHESKLLTKCVQDIHVLLDAPGDPDEGYGWD 93 >UniRef50_Q03LF6 CRISPR-associated protein, Cas1 family n=6 Tax=Streptococcus RepID=Q03LF6_STRTD Length = 303 Score = 51.2 bits (121), Expect = 4e-05, Method: Composition-based stats. 
Identities = 31/227 (13%), Positives = 78/227 (34%), Gaps = 33/227 (14%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVGTLLVWVG 79 + ++ + + +L+ K G +P+ ++ I+ E G T V+ + ++ LV Sbjct: 12 EKMRLKLDN---LLVQKMGQEFTVPLSDISIIVAEGGDTVVTLRLLSALSKYNIALVVCD 68 Query: 80 EAGVR--VYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA--RRSVE 135 + +Y S +L Q + + + + +++ E A +S++ Sbjct: 69 NEHLPTGIYHSQNGHFRAYKRLKEQLDWSQKQKDKAWQIVTYYKINNQEDVLAMFEKSLD 128 Query: 136 QLR---------------GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180 +R EG + + L G+++ ++ D IN + Sbjct: 129 NIRLLSDYKEQIEPGDRTNREGHAAKVYFNEL--------FGKQFVRVTQKEADVINAGL 180 Query: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKF 225 + + + + G +G H + V D+ + + Sbjct: 181 NYGYAIMRAQMARIVAGYGLNGLLGIFHKNEYNQFNLVDDLMEPFRQ 227 >UniRef50_B1X158 DUF48-containing protein n=12 Tax=Cyanobacteria RepID=B1X158_CYAA5 Length = 330 Score = 51.2 bits (121), Expect = 5e-05, Method: Composition-based stats. Identities = 44/309 (14%), Positives = 91/309 (29%), Gaps = 49/309 (15%) Query: 18 IFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +++ Q + + + K + I + + +++ ++++ A+R + Sbjct: 4 LYVSQQGCYVKLDQETLKVEKKRQLLAEIQLPLIEQVLIFGKSQMTTQAIRACLWRNIPI 63 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLY-QAKLALDEDLRLKVVRKM--------------F 120 V++ G Y L Q +L D RL R++ Sbjct: 64 VYLSRMGY-CYGRIMSLKRGYRHLTRYQQQL--DFSQRLLTARELVKAKLKNCRVILQRQ 120 Query: 121 ELRFG---------------EPAPARRSVEQLRGIEGSRVRATYALLAKQYG---VTWNG 162 + R E A ++E+L G EG ++ + Sbjct: 121 QRRLQSDKLLFAIDSLNYLIEQANLAETIERLMGFEGVGASTYFSAFGDCLTPSEFIFLA 180 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIA 220 R P G+ +N +S L+ A I G P +H G + D+ Sbjct: 181 RSRRPP----GNPVNAMLSFGYQVLWNHLLALIELQGLDPYQACLHQGSERHAALASDLI 236 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFR----SSKTLAKLIPLIEDVLAAGE 276 + + V + R D + + K L + +E+ + Sbjct: 237 EEFRAPMVDSLVLYLVNRRIMNADDDFEYHDGGCYLNNTGRKKYLKHFVQRMEETVQTTP 296 Query: 277 IQPPAPPED 285 + P D Sbjct: 297 DEKQ-PRWD 304 >UniRef50_B2KB47 
CRISPR-associated protein Cas1 n=2 Tax=Elusimicrobia RepID=B2KB47_ELUMP Length = 298 Score = 50.4 bits (119), Expect = 8e-05, Method: Composition-based stats. Identities = 31/223 (13%), Positives = 74/223 (33%), Gaps = 38/223 (17%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 + V + F + + H + I+L +S+ ++ + +++ + Sbjct: 13 HLCVKNNNFSAVKDREEKLHCLFDDINSIILYGNNITISNTCIQKCLEHKVPVIFCDKT- 71 Query: 83 VRVYASGQPGGARS-----DKLLYQA--------KLALD-----EDLRLKVVRKMFELRF 124 +G + + +L Q + + + +V+++ L+ Sbjct: 72 --YNPAGMLLSSFTTNIYGRRLQLQINASKPQIKQAWQQIITSKLNNQAEVLKRFDTLK- 128 Query: 125 GEPAPARRSVEQLRG----IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180 + E G EG + + L + ++ + D IN + Sbjct: 129 AAETIFNMAREVRSGDATFKEGVGAKVYFENLFNDFH----------RNTDDKDIINSAL 178 Query: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADII 223 + + + A+++AG PAIG H+ F I D+I Sbjct: 179 NYGYAIVRSSIARAVVSAGLNPAIGIFHSKNHNPFCL-IDDLI 220 >UniRef50_B1L400 CRISPR-associated protein Cas1 n=2 Tax=Archaea RepID=B1L400_KORCO Length = 341 Score = 50.0 bits (118), Expect = 1e-04, Method: Composition-based stats. 
Identities = 50/261 (19%), Positives = 93/261 (35%), Gaps = 54/261 (20%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVW---VGEA 81 + G +++ K G + IP+ SV +++ +S ++ AQ GT L+ G Sbjct: 18 LRKRRGRILILSK-GEKKEIPMKSVKEVVIIGKAALSSELLKALAQSGTDLLIATPTGRP 76 Query: 82 GVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRK--------------MF--ELRFG 125 R+ + G AR+ Y++ L++ +++ R + R Sbjct: 77 VARLIPAKAGGTARNRYEQYKS---LEDRRGIEIARAVIVGKIRNQASNLSYYSKARRMD 133 Query: 126 EPAPARR---------SVEQLRGIE----GSRVRATYALLAKQYGVTW------------ 160 E + +E+L+ E + A +K + W Sbjct: 134 EELSSELYDAAQQLKREMEELKNEEFPDIDEARKRIMARESKCANIYWEKIASIMEEWKF 193 Query: 161 NGRRYDPKDWEKG--DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFV 216 GR D E D +N C++ + L +L G P +G++H +P S V Sbjct: 194 RGRE-KRTDLEGNVIDPVNLCLNVCYNLLSAQIWKNVLRFGLDPFLGYLHVERPGRISLV 252 Query: 217 YDIADIIKFDTVVPKAFEIAR 237 YD+ + + V F R Sbjct: 253 YDLMEPFRP-MVDRFVFSYLR 272 >UniRef50_B9M5J4 CRISPR-associated protein Cas1 n=9 Tax=Bacteria RepID=B9M5J4_GEOSF Length = 344 Score = 50.0 bits (118), Expect = 1e-04, Method: Composition-based stats. Identities = 21/109 (19%), Positives = 44/109 (40%), Gaps = 4/109 (3%) Query: 134 VEQLRGIEGSRVRATYALLAK--QYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVT 191 ++Q+RG+EG A +A+ + + G + D +N +S + + Sbjct: 159 IDQVRGLEGESAAAYFAVFDQMVKDGDRAAFAMDNRNRRPPLDPMNALLSFLYTLVLNDC 218 Query: 192 EAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARR 238 +A+ + G +GF+H +P S D+ + + A + R Sbjct: 219 ISAVESVGLDSQMGFLHALRPGRPSLGLDLMEEFRAVIADRLALTLINR 267 >UniRef50_B4AQ39 Crispr-associated protein Cas1 n=5 Tax=Francisella RepID=B4AQ39_FRANO Length = 334 Score = 50.0 bits (118), Expect = 1e-04, Method: Composition-based stats. 
Identities = 25/102 (24%), Positives = 38/102 (37%), Gaps = 11/102 (10%) Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNGRRYDPKDWEKGDTINQCISAAT 184 S+ +L GIEG+ + + YG +W GR+ K D N + Sbjct: 151 ELENTTSLAELMGIEGNVAKNFFKGF---YGHLDSWQGRKPRIKQ----DPYNVVLDLGY 203 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK 224 S L+ E + G+ GF H K S V D + + Sbjct: 204 SMLFNFVECFLRLFGFDLYKGFCHQTWYKRKSLVCDFVEPFR 245 >UniRef50_Q5LZX6 Putative uncharacterized protein n=2 Tax=Streptococcus thermophilus RepID=Q5LZX6_STRT1 Length = 207 Score = 49.6 bits (117), Expect = 1e-04, Method: Composition-based stats. Identities = 35/211 (16%), Positives = 73/211 (34%), Gaps = 42/211 (19%) Query: 15 VSMIFLQYGQ--IDVIDGAFVLI-DKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 +S ++ Q + + + ++ D I + + V ++L +++ ++ ++ Sbjct: 1 MSDLYSQRSNYYLSLSEQRIIIKNDNKEIVKEVSISLVDNVLLFGNAQLTTQLIKALSKN 60 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSD--KLLYQAKLALDEDLRLKVVRKM---------- 119 + + G + S + + K QAK +ED RL+V R + Sbjct: 61 KVNVYYFSNVG--QFISSIETHRQDEFQKQELQAKAYFEEDFRLEVARSIATTKVRHPIA 118 Query: 120 ---------------FELRFGE---PAPARRSVEQLRGIEGSRVRA--TYALLAKQYGVT 159 + RF + S+ ++ G EG ++ Y L Sbjct: 119 LLREFDTDGLLDTSDYS-RFEDSVNDIQKAYSITEIMGYEGRLAKSYFYYLNLLVPNDFH 177 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 +NGR P + D N ++ S LY Sbjct: 178 FNGRSRRPGE----DCFNSALNFGYSILYSC 204 >UniRef50_A2SQK9 Uncharacterized protein predicted to be involved in DNA repair-like protein n=1 Tax=Methanocorpusculum labreanum Z RepID=A2SQK9_METLZ Length = 310 Score = 49.6 bits (117), Expect = 1e-04, Method: Composition-based stats. 
Identities = 44/282 (15%), Positives = 87/282 (30%), Gaps = 44/282 (15%) Query: 33 VLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRV---YASG 89 +++ + G T P+ + +++ G + + + A G + + G V Y G Sbjct: 21 LIVRQKGSTTQYPLDDMRHLLIAGGHSLHTSVLERLADRGIAVSFFTAHGKPVGGIYGKG 80 Query: 90 QPGGARSDKLL-----YQAKLALDEDLRLKVVRKMFELRFGEPAPARR-----------S 133 P A + + A + D RL+ + E P + Sbjct: 81 APSLASQQRDIPIHKFAMASIRSSLDERLR-----YINELAEFDPEGLYFKGEFDILTAA 135 Query: 134 VEQLR--------GIEGSRVRA-TYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 E+L G S + Y ++ ++ RR D +N +S Sbjct: 136 REELEYLITLPEIGRAFSLTKTMYYEIIGRKLPKVLGYRR--RCQPPFMDPVNVMMSHGY 193 Query: 185 SCLYGVTEAAILAAGYAPAIGFVH------TGKPLSFVYDIADIIKFDTVVPKAFEIARR 238 + LY A AG + G ++ G V D+ + V ++A Sbjct: 194 AVLYANFALACTGAGLDLSRGALYGEIVSAPGGRGGCVLDLMEPATVSMVDRVIIQMAAE 253 Query: 239 N--PGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 G + R + + + + +L I L + Sbjct: 254 GRLDGAYEVTTRCLLSNELKE-EFMKRLHGSINITLIEENVN 294 >UniRef50_C7XMU1 CRISPR-associated protein cas1 n=2 Tax=Fusobacterium RepID=C7XMU1_9FUSO Length = 292 Score = 49.2 bits (116), Expect = 2e-04, Method: Composition-based stats. 
Identities = 36/266 (13%), Positives = 88/266 (33%), Gaps = 28/266 (10%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGT-RVSHAAVRLAAQVGTLLVWVGE 80 ++D+ + + G I +G V ++LE T ++ A + + +++ E Sbjct: 12 RSKLDLRYNSISIRRDNGTDF-IHIGEVNTLILETTTISITAALMCELIKQKVKVIFCDE 70 Query: 81 AGVRVY-ASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRG 139 + G + + D ++ + E + + + Sbjct: 71 KSNPHFELLPFYGSHDCSAKIKEQIAWTDFLK-----ESLWTIIVTEKIENQMKLLKKLN 125 Query: 140 IEGSRVRATYALL----------AKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 E ++ YA + ++ + K +++N ++ L Sbjct: 126 KEEYKILQEYASQIEHNDNTNREGHSAKIYFSALFGNNFSRNKENSLNAFLNYGYQLLLS 185 Query: 190 VTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREV 247 I+A GY IG H + D+ + + V A+ + NP + +++ Sbjct: 186 TFNKEIVANGYLTQIGIFHKNMFNYYNLSSDLMEPFRV-IVDELAY---KENPQKFEKDE 241 Query: 248 RLACRDI----FRSSKTLAKLIPLIE 269 + ++I FR + L +I+ Sbjct: 242 KRKLQNILNLKFRINNVNHYLSDIIK 267 >UniRef50_D1VVR4 CRISPR-associated endonuclease Cas1, DVULG subtype n=1 Tax=Peptoniphilus lacrimalis 315-B RepID=D1VVR4_9FIRM Length = 343 Score = 49.2 bits (116), Expect = 2e-04, Method: Composition-based stats. Identities = 24/98 (24%), Positives = 41/98 (41%), Gaps = 5/98 (5%) Query: 135 EQLRGIEGSRVRATYALLAKQYGVTWNGRRY--DPKDWEKGDTINQCISAATSCLYGVTE 192 + +RGIEG+ R +++L ++ V Y + D N +S S L Sbjct: 159 DSIRGIEGTIARQYFSVL-DEFIVKQREDFYFIERTKRPPRDRFNAMLSFMYSILTNSIA 217 Query: 193 AAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTV 228 +A+ G GF HT +P S DI + ++ + Sbjct: 218 SALEGVGIDSYAGFFHTDRPGRVSMALDIIEEMRAFII 255 >UniRef50_A8TI31 CRISPR-associated protein Cas1 n=1 Tax=Methanococcus voltae A3 RepID=A8TI31_METVO Length = 235 Score = 48.8 bits (115), Expect = 2e-04, Method: Composition-based stats. 
Identities = 29/123 (23%), Positives = 45/123 (36%), Gaps = 15/123 (12%) Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SF 215 + R P + E IN IS S LY + I P++ ++H S Sbjct: 81 FKYEKRSRRPPENE----INALISFGNSLLYSTVISEIFNTHLNPSVSYLHEPYERRYSL 136 Query: 216 VYDIADIIKFDTVVPKAFEIARR---NPGEPDREVRLAC-----RDIFRSSKTLAKLIPL 267 DIAD+ K V F + + N ++++ R IF S K +L Sbjct: 137 ALDIADVFKPIFVDRLIFNLVNKKIINENHFEKDLNSCLLNDEGRAIFLS-KYNERLQKT 195 Query: 268 IED 270 I+ Sbjct: 196 IKH 198 >UniRef50_P71636 CRISPR-associated protein Cas1 n=11 Tax=Mycobacterium tuberculosis complex RepID=P71636_MYCTU Length = 338 Score = 47.7 bits (112), Expect = 4e-04, Method: Composition-based stats. Identities = 23/114 (20%), Positives = 39/114 (34%), Gaps = 8/114 (7%) Query: 133 SVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 S+ +L G EG+ +A + L + GR P D N +S S LY Sbjct: 147 SLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRSTRPP----LDAFNSMVSLGYSLLYKN 202 Query: 191 TEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGE 242 AI IGF+H + D+ ++ + + + + Sbjct: 203 IIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAPIIDDTVLRLIADGVVD 256 >UniRef50_B2UP48 CRISPR-associated protein Cas1 n=1 Tax=Akkermansia muciniphila ATCC BAA-835 RepID=B2UP48_AKKM8 Length = 311 Score = 47.7 bits (112), Expect = 5e-04, Method: Composition-based stats. 
Identities = 31/222 (13%), Positives = 61/222 (27%), Gaps = 26/222 (11%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLLVWV--- 78 + G D IP+ V ++L ++ + A+ V Sbjct: 13 CHLSCDKGQLRCADGENSPRTIPLEDVGAVVLSSFKATLTSNLLIELARKRIGFVLCESY 72 Query: 79 ------------GEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE 126 + G+ + + P R+ L E Sbjct: 73 RPAVLLLPADRSTDTGLLRHLADMPARLRNRLWQKTLDAKCGNQTALAQAWNPHHPAIAE 132 Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 + + R E +R+ +++ A TW + E+G N + A + Sbjct: 133 LKRMAVTEKTAREAECARL--FWSVFAD----TWANSDFRRGRHEEG--FNNLFNYAYAI 184 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFD 226 L + A G P G H + + YD+ + + Sbjct: 185 LLSCILQYLFALGLDPCFGIFHQSREHAAPLAYDLMEPFRPA 226 >UniRef50_UPI00016B206F hypothetical protein cdiviTM7_00753 n=1 Tax=candidate division TM7 single-cell isolate TM7c RepID=UPI00016B206F Length = 296 Score = 47.7 bits (112), Expect = 6e-04, Method: Composition-based stats. Identities = 37/261 (14%), Positives = 90/261 (34%), Gaps = 35/261 (13%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRLAAQVGTLLVWVGEA 81 ++ + D V+ +T +P+ + ++L G + + A GT + E Sbjct: 12 ARLSLRDNQLVIAQETEAT--LPIEDIDSLILDGYGITTTTNLLAALATKGTTTIICDEK 69 Query: 82 GVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQ--LR- 138 + AS ++ QAK+ ++ + + + + + + ++ + + LR Sbjct: 70 HLP--ASVLLPYSQHS---RQAKV---SRQQIAMSQPLKKQLWQQIIISKITNQADVLRS 121 Query: 139 -GIEGSRVRATYALLAKQYGVTWNGRRYDPK---------DWEKGDTI--NQCISAATSC 186 G++ + +R ++ + R D + I N ++ + Sbjct: 122 TGLDDAALRT---HISDVKSGDTSNRESIAARIYFDQLLDDATRRKPIWHNTALNYGYAM 178 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEPD 244 + I A G + G H + SF D+ + + + ++A + G+ D Sbjct: 179 VRSHIARHIAARGLVASQGIFHHNELNSFNLADDLIEPYRAAVDLYVLEKVAPLHVGDRD 238 Query: 245 ----REVRLACRDIFRSSKTL 261 + R DI + Sbjct: 239 ASLTKHDRQLIIDILNYYVIM 259 >UniRef50_A8LN06 CRISPR-associated protein Cas1 n=2 Tax=Rhodobacterales RepID=A8LN06_DINSH Length = 303 Score = 47.7 bits (112), 
Expect = 6e-04, Method: Composition-based stats. Identities = 41/255 (16%), Positives = 75/255 (29%), Gaps = 18/255 (7%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSV-ACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 + G + + IP+ + I+ GT + + + A G +V G Sbjct: 13 HLSRDRGFLKVSEGAREIGRIPLDQIAGVIVHAHGTTWTTSLLTELADRGAPVVLCGANH 72 Query: 83 VR------VYASGQPGGARSDKLLYQAKLALDEDLRLKVVR-KMFELRFGEPAPARRSVE 135 + G + +A L + + + M V Sbjct: 73 APRSVLMPLDGHHAQGARLRAQWQARAPLVKQAWKQTVIAKIAMQAAALEAMGEPHAPVG 132 Query: 136 QL-RGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAA 194 L R + A A+ Y G + D D +N ++ + L T A Sbjct: 133 MLARKVTSGDATNVEAQAARLYWPRMMGTEFRR-DRTAPD-LNALLNYGYTVLRAATARA 190 Query: 195 ILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACR 252 ++AAG P IG H+ + +F D+ + + P R +V A + Sbjct: 191 VVAAGLHPTIGLHHSNRGNAFALADDLMEPFR-----PLVDCCVRGLAARNGPQVDPAAK 245 Query: 253 DIFRSSKTLAKLIPL 267 L + Sbjct: 246 QSLARLIALDLPLGD 260 >UniRef50_C8N6V7 Putative uncharacterized protein n=1 Tax=Cardiobacterium hominis ATCC 15826 RepID=C8N6V7_9GAMM Length = 334 Score = 46.9 bits (110), Expect = 8e-04, Method: Composition-based stats. 
Identities = 40/255 (15%), Positives = 88/255 (34%), Gaps = 48/255 (18%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S +++ ++ V + + + +P+ + I ++ +S A + + Sbjct: 1 MSTLYIDRQNTRMTVSGNTLIFYENGERASTLPLHIIDRICIKGDLALSAADLGKLGEHN 60 Query: 73 TLLVWVGE--------------AGVRVYASGQPGGARSDKLLYQAKLALDEDLRLK--VV 116 ++ + +R A + QA+ + E ++ + ++ Sbjct: 61 IGVLILSGREQQPTIYLPCARKDALRRLAQAH-FSQDNTFCTRQAQSWITEKIQREQDLL 119 Query: 117 RKMFEL--RFGEPAPARR---------------SVEQLRGIEGSRVRATYALLAKQY--G 157 R++ R G + LRGIEGS +T+A +A Sbjct: 120 RELQTRPHRGGHQLHENLEQLEKSRLRLQNPINDLATLRGIEGSAASSTFAAIACVLPES 179 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCL-YGVTEAAILAAGYAPAIGFVHT---GKPL 213 + + R +P D N +S + L Y + I G P IG+ H+ G+ Sbjct: 180 LHFTKRNRNPPR----DPYNVGLSLGYTLLHYAMVRQ-IHLTGLDPCIGYYHSIEHGRE- 233 Query: 214 SFVYDIADIIKFDTV 228 S D+ + ++ Sbjct: 234 SLACDLIEAMRPLVT 248 >UniRef50_A6DE79 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Caminibacter mediatlanticus TB-2 RepID=A6DE79_9PROT Length = 281 Score = 46.9 bits (110), Expect = 0.001, Method: Composition-based stats. Identities = 26/152 (17%), Positives = 55/152 (36%), Gaps = 10/152 (6%) Query: 133 SVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTE 192 S + L GIEG+ + + + T + D +N +S + LY Sbjct: 126 SKDSLLGIEGNFAKEYFKEYFSLFDKTLT--KGYRSKRPPEDVVNALMSYLYTLLYYEIA 183 Query: 193 AAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPK--AFEIARRNPGEPDREVR 248 ++ G+ I ++H +S D+ ++ + D V F+ + ++ Sbjct: 184 NRLIFYGFEVGISYLHESFRDHMSLASDLLEVFRSDVDVFVYEMFDNKKV----IKKDFT 239 Query: 249 LACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 + IF S+ ++ I+D +I Sbjct: 240 KEKKGIFLRSEKRKEIWSDIKDFFENLKIDEE 271 >UniRef50_A6VLA8 CRISPR-associated protein Cas1 n=13 Tax=Proteobacteria RepID=A6VLA8_ACTSZ Length = 305 Score = 46.5 bits (109), Expect = 0.001, Method: Composition-based stats. 
Identities = 40/239 (16%), Positives = 86/239 (35%), Gaps = 29/239 (12%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTR 59 MTW I + G++ + LI + G +P+ +A +++E T Sbjct: 1 MTW---RSILMSKG--------GKLSLQQNQM-LIQQEGNEFTVPLEDIAIVVVESRETV 48 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGA-----RSDKLLYQAKLALDEDLRLK 114 ++ + G + E + P ++ KL +A L + L Sbjct: 49 ITIPLLSAFGLHGVTFLTCDEQFLPC-GQWLPFNQYHRQLKTLKLQLEASLPQKKQLWQA 107 Query: 115 VVRKMFELRFGEPAPARRSVEQLRGIEGS-RVRA-----TYALLAKQYGVTWNGRRYDPK 168 +V++ + G + E R ++ + +V++ A A Y T G+ + Sbjct: 108 IVQQKIRNQAGVLKICKFQAESDRLLKMAEKVKSGDKENLEAQSAVIYFQTLFGKGFKRS 167 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKF 225 + E +N ++ + + A++ G+ P IG H + +F D + + Sbjct: 168 EDES--AVNSALNYGYTVMRSAVARALVLYGWLPQIGLFHRSELNAFNLADDFIEPFRP 224 >UniRef50_A9AX66 CRISPR-associated protein Cas1 n=1 Tax=Herpetosiphon aurantiacus ATCC 23779 RepID=A9AX66_HERA2 Length = 350 Score = 46.5 bits (109), Expect = 0.001, Method: Composition-based stats. 
Identities = 41/284 (14%), Positives = 80/284 (28%), Gaps = 56/284 (19%) Query: 18 IFL--QYGQIDVIDGAFVLI---DKTGIR----THIPVGSVACIMLEPGTRVSHAAVRLA 68 ++L QY + A + D+ R +P+ + ++++ ++ +A+ Sbjct: 4 LYLSEQYSIVKREGEALRVEIPEDQQLGRQRQVVRVPLNVIERVVVQGEITLTASALACL 63 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM--------- 119 + ++ +G A + L Q R + R Sbjct: 64 LERRICTHFLSYSGRSQGALTPDPTRNASLRLAQYAAHTSIQHRFSLARTFVDGKLRNLR 123 Query: 120 -FELRFGE-PAPARRS--VEQLR----------------------------GIEGSRVRA 147 LRF + +E+LR G EG A Sbjct: 124 TQILRFNRSQREPTLTQAIERLRDAHRDLHGLSIPEYVDPLDRMHGMGQILGCEGQGSAA 183 Query: 148 TYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFV 207 + W + + D +N +S L + + G+ P IGF+ Sbjct: 184 YWDCWGMLLNQPW--EWHGRRRRPPPDPVNALLSYGYVILTSQVLSQLAIVGFDPYIGFL 241 Query: 208 HT---GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVR 248 H GKP + D+ + + V + R Sbjct: 242 HQSSFGKP-ALALDLMEEFRPVIVDSVVLTVLNTKILNQQHFQR 284 >UniRef50_C9RJP2 CRISPR-associated protein Cas1 n=21 Tax=cellular organisms RepID=C9RJP2_FIBSS Length = 298 Score = 46.2 bits (108), Expect = 0.001, Method: Composition-based stats. Identities = 16/100 (16%), Positives = 34/100 (34%), Gaps = 11/100 (11%) Query: 175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKA 232 N ++ + L GV A++++G P +G H + ++ DI + + V Sbjct: 174 PPNNMLNYGYAILRGVVARALVSSGLLPTLGIHHHNRYNAYCLADDIMEPYRP-IVDKLI 232 Query: 233 FEIARRNPGEP--------DREVRLACRDIFRSSKTLAKL 264 E+ P + +R+ D + Sbjct: 233 LEVINEIEEYPTDLSTEIKAKLLRIPVLDCVIDGNRSPLM 272 >UniRef50_C0WE67 Crispr-associated protein n=1 Tax=Acidaminococcus sp. D21 RepID=C0WE67_9FIRM Length = 290 Score = 45.4 bits (106), Expect = 0.002, Method: Composition-based stats. 
Identities = 27/214 (12%), Positives = 68/214 (31%), Gaps = 11/214 (5%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLE-PGTRVSHAAVRLAAQVGTLLVWVG 79 Q ++D + + I + + +++E G ++ + L++ Sbjct: 10 QRCKLDFCMNYVEVQTAV-SKKRIFIDEIKTLIIENTGVAITAYLLSELMNRKVNLIFCD 68 Query: 80 EAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLR-----LKVVRKMFELRFGEPAPARRS 133 +A + S + + Q + E + + + F + + Sbjct: 69 KAHNPQSSLLPLHARFDSIRKIKQQMIWSQEIKDEVWDCIVKAKIRQQALFLDELEKKEQ 128 Query: 134 VEQLRG-IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTE 192 + L ++ + V +N + + IN ++ S L + Sbjct: 129 SKMLMSYLDDVVSADAHNREGHAAKVYFNALFGNSFTRDLDSPINAGLNYGYSLLLSLFN 188 Query: 193 AAILAAGYAPAIGFVHTG--KPLSFVYDIADIIK 224 I++AGY +G H P +F D+ + + Sbjct: 189 REIVSAGYLTQLGIFHENTYNPYNFSCDLMEPFR 222 >UniRef50_B0VHC1 Putative uncharacterized protein n=1 Tax=Candidatus Cloacamonas acidaminovorans RepID=B0VHC1_9BACT Length = 344 Score = 45.4 bits (106), Expect = 0.002, Method: Composition-based stats. Identities = 22/95 (23%), Positives = 34/95 (35%), Gaps = 11/95 (11%) Query: 137 LRGIEGSRVRATYALLAKQYG---VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEA 193 L G EG + +A ++GR P D +N +S + + L Sbjct: 155 LLGYEGIAAKNYFAAFPDLIANPDFPFSGRNKRPPK----DEVNAMLSLSYTFLMNQVMC 210 Query: 194 AILAAGYAPAIGFVHT---GKPLSFVYDIADIIKF 225 A G P G +H G+ S V DI + + Sbjct: 211 AAYICGLDPYYGALHDLDYGRQ-SLVLDIMEEFRP 244 >UniRef50_C7G696 CRISPR-associated protein Cas1 n=2 Tax=Roseburia RepID=C7G696_9FIRM Length = 301 Score = 45.4 bits (106), Expect = 0.003, Method: Composition-based stats. 
Identities = 43/225 (19%), Positives = 83/225 (36%), Gaps = 21/225 (9%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGT-RVSHAAVRLAAQVGTLLVWVGE 80 + ++ + + + +T I IP+ + CI++E T VS ++ A +G + E Sbjct: 11 HVKLSIKNQQLNI--ETDIARQIPLEDINCIIIENQTVTVSAYLLQKMADMGIAVYVCDE 68 Query: 81 AGVRVYASGQPGGARSD---KLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSV--- 134 + A P S L YQ + RL + ++R A + Sbjct: 69 KHLPN-AVLLPMVRHSRHFKILKYQIEAGKPLQKRLWQQIVVQKIRNQALCLAYLELDGS 127 Query: 135 EQLRGI-----EGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 E+L + G R A A Y + G + + IN ++ + + G Sbjct: 128 EELMKMCKEVQSGDRT-HVEAKAAAFYFKSLYGLGFSRGNDH---IINAALNYGYAIVRG 183 Query: 190 VTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKA 232 + +I+ G P+IG H + +F D+ + + + A Sbjct: 184 LIARSIVCYGLEPSIGVFHHSELNNFNLADDMIEPFRPLVDLYVA 228 >UniRef50_Q03JI7 CRISPR-associated protein, Cas1 family n=40 Tax=Bacilli RepID=Q03JI7_STRTD Length = 289 Score = 45.4 bits (106), Expect = 0.003, Method: Composition-based stats. Identities = 36/293 (12%), Positives = 82/293 (27%), Gaps = 48/293 (16%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV--SHAAVRLAAQVGTLLVWVG 79 + ++ + + + I + + ++LE T + + V+ L+++ Sbjct: 12 HSKLSYKNNHLIFRNSYKTEM-IHLSEIDILLLET-TDIVLTTMLVKRLVDENILVIFCD 69 Query: 80 EAG------VRVYA--------SGQPGGARSDKLLYQAKLAL----DEDLRLKVVRKMFE 121 + YA + Q + K + ++ L + Sbjct: 70 DKRLPTAFLTPYYARHDSSLQIARQIAWKENVKCEVWTAIIAQKILNQSYYLGECSFFEK 129 Query: 122 LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCIS 181 + EG R + L +G + E + IN + Sbjct: 130 SQSIMELYHGLERFDPSNREGHSARIYFNTL---FGNDFTR--------ESDNDINAALD 178 Query: 182 AATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRN 239 + L + ++ G IG H + DI + + P I +N Sbjct: 179 YGYTLLLSMFAREVVVCGCMTQIGLKHANQFNQFNLASDIMEPFR-----PIIDRIVYQN 233 Query: 240 PGEPDREVRLACRDIFRSSKT-------LAKLIPLI-EDVLAAGEIQPPAPPE 284 +++ IF + L+ ++ + V+ A PE Sbjct: 234 RHNNFVKIKKELFSIFSETYLYNGKEMYLSNIVSDYTKKVIKALNQLGEEIPE 286 >UniRef50_D0WRI5 CRISPR-associated protein Cas1, NMENI subtype n=3 
Tax=Actinomycetales RepID=D0WRI5_9ACTO Length = 238 Score = 45.0 bits (105), Expect = 0.004, Method: Composition-based stats. Identities = 27/119 (22%), Positives = 40/119 (33%), Gaps = 11/119 (9%) Query: 152 LAKQYGVTWNG------RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIG 205 A+ + W R P N + A + L G A+L AG +P IG Sbjct: 84 EARAARLYWQALWGEESFRRHPGLGSGESCRNSHLDYAYTVLRGHGIRAVLGAGLSPTIG 143 Query: 206 FVHTGKPLSF--VYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLA 262 H G+ +F V DI + + + + N D EVR + L Sbjct: 144 LFHHGRSNNFALVDDIIEPFRP--AIDSSVSRLAPNADMKDPEVRKHLVAA-ADQRFLP 199 >UniRef50_B3DR65 Putative uncharacterized protein n=1 Tax=Bifidobacterium longum DJO10A RepID=B3DR65_BIFLD Length = 139 Score = 44.6 bits (104), Expect = 0.005, Method: Composition-based stats. Identities = 16/84 (19%), Positives = 31/84 (36%), Gaps = 4/84 (4%) Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFE 234 N + A L G A++AAG P++G H G+ F D+ ++ + + Sbjct: 19 NAQLDYAYMVLRGFAVKAVIAAGLIPSLGVNHHGRGNYFCLADDLLEVYRPAVDYRVS-- 76 Query: 235 IARRNPGEPDREVRLACRDIFRSS 258 + ++ V+ D Sbjct: 77 LLNDEDSLQEKAVKKHLVDAVNQQ 100 >UniRef50_B9CMG3 CRISPR-associated protein Cas1, NMENI subtype n=1 Tax=Atopobium rimae ATCC 49626 RepID=B9CMG3_9ACTN Length = 215 Score = 44.2 bits (103), Expect = 0.005, Method: Composition-based stats. 
Identities = 29/164 (17%), Positives = 56/164 (34%), Gaps = 24/164 (14%) Query: 97 DKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY 156 DK+ +QA+ L+ R ++ +++F L E + E R + L + Sbjct: 34 DKITHQAQ-VLNARAREEIGQQLFGL-IPEVRSGDTT-----NREAHAARLYFHAL---F 83 Query: 157 GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFV 216 G ++ + IN + + L I+A GY G H + F Sbjct: 84 GHEFSR--------DDETPINAALDYGYAILLSAVNREIVARGYLTQSGICHRSEYNQFN 135 Query: 217 Y--DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSS 258 D + + V F+ G+ ++ + D+ S Sbjct: 136 LGCDFMEPFRP-IVDRLVFDNV---EGDFTKDTKRLLIDMLNQS 175 >UniRef50_UPI000174611D CRISPR-associated protein Cas1/Cas4 n=1 Tax=Verrucomicrobium spinosum DSM 4136 RepID=UPI000174611D Length = 847 Score = 44.2 bits (103), Expect = 0.006, Method: Composition-based stats. Identities = 48/302 (15%), Positives = 85/302 (28%), Gaps = 63/302 (20%) Query: 5 PLNPIPLKDRVSMIFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 P + +D ++L + V+ D + + + + + +++ Sbjct: 485 PRRLMAARDDARALYLSTPGYHVGRSGELLVVKDGASLVEEFRINDLTNVAVFGNVQITT 544 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE- 121 AV++ + L + G + + Q + A D L + R+ Sbjct: 545 QAVQVLCEKEIPLAYFSTGGWFYGLTRGHVTKNVFTRIEQFRAADDPMRCLALSRRFVAG 604 Query: 122 ---------LRFGEPAPAR---------------RSVEQLRGIEGSRVRATYALLA---- 153 +R PA RS+E L GIEG+ A Sbjct: 605 KIRNHRTLLMRLHVEPPAAVLARLKQASQDALGARSLETLLGIEGAAAALYLQHFAGMIK 664 Query: 154 ------------------------KQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 + + R P D +N +S A S L Sbjct: 665 VGAADDDDEIPGLESASATRVPEESVFTFDFTKRSRRPP----TDPVNALLSLAYSLLAK 720 Query: 190 VTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDRE 246 A A G+ P +GF H G+P + D+ + + PD Sbjct: 721 DCTIAAHAVGFDPYVGFYHQPRYGRP-ALALDLMEEFRPLVAESVVLTAINNRMLVPDHF 779 Query: 247 VR 248 VR Sbjct: 780 VR 781 >UniRef50_C2EF73 CRISPR-associated Cas1 family protein n=1 Tax=Lactobacillus salivarius ATCC 11741 RepID=C2EF73_9LACO Length = 308 Score = 43.8 bits (102), Expect = 0.007, Method: Composition-based 
stats. Identities = 20/127 (15%), Positives = 40/127 (31%), Gaps = 12/127 (9%) Query: 141 EGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGY 200 E R ++ L + ++ RYD D IN ++ + L ++ G Sbjct: 156 EAYAARVYFSNL---FSKSFKRGRYD-------DIINSSLNYGYAILRSAIRKELVIYGL 205 Query: 201 APAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSS 258 P+ G H F DI ++ + + E D E++ + Sbjct: 206 EPSWGIHHVSTENPFNLSDDIIEVYRPFLDALVVELLNNNESEELDIELKKEIIKVLFEK 265 Query: 259 KTLAKLI 265 + + Sbjct: 266 CIIDNKV 272 >UniRef50_D1AUW5 CRISPR-associated protein Cas1 n=1 Tax=Streptobacillus moniliformis DSM 12112 RepID=D1AUW5_STRM9 Length = 308 Score = 43.8 bits (102), Expect = 0.008, Method: Composition-based stats. Identities = 29/162 (17%), Positives = 52/162 (32%), Gaps = 31/162 (19%) Query: 135 EQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAA 194 ++ EG+ + + L YG + + + D+IN + YGV +A Sbjct: 143 SDVQNREGTAAKVFFNYL---YGTNFCRQ-------NERDSINMALDYG----YGVFRSA 188 Query: 195 ILAA----GYAPAIGFVHTGKPLSF--VYDIADIIKFDTVVPKAFEIARRNPGEPD--RE 246 I G+A IG H+ +F YD + + + F R + + + Sbjct: 189 ITRLLCTYGFATYIGVHHSSMMNAFNLTYDFIEPYRP-IIDYYVFNHLYRFEKDDELKTD 247 Query: 247 VRLACRDIFR--------SSKTLAKLIPLIEDVLAAGEIQPP 280 R + L + LI+ L E Sbjct: 248 TRKELISLLNANIKVNEKEYTVLYSMELLIKSYLNFLEEGNE 289 >UniRef50_C8PNV3 CRISPR-associated protein Cas1 n=1 Tax=Treponema vincentii ATCC 35580 RepID=C8PNV3_9SPIO Length = 296 Score = 43.1 bits (100), Expect = 0.013, Method: Composition-based stats. 
Identities = 31/225 (13%), Positives = 82/225 (36%), Gaps = 23/225 (10%) Query: 18 IFLQYGQ-IDVIDGAFVLIDKTG-IRTHIPVGSVACIMLE-PGTRVSHAAVRLAAQVGTL 74 +F + + V V+ + T +P+ + +++E ++ + + Sbjct: 6 LFFSHAVCLSVKHKQLVIFSEETQEETLVPIEDIGFVIVENERVSLTIPLINELTENNCA 65 Query: 75 LVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM------FELRFGEPA 128 L++ E + +++ Q + + +L V +K ++++ Sbjct: 66 LIFCNEKHMPF---SMTMPLDCNEIQSQL-FSAQINAKLPVKKKCWKQVVEYKIKNQGLL 121 Query: 129 PARRSVEQLRGIE-GSRVRA-----TYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISA 182 + ++ R + RV++ + AK Y G+ + + G+ N ++ Sbjct: 122 LKKYDLDFARLADFSKRVKSGDSTNMESQAAKFYWDNLFGKNWCRNRF--GEFPNNYLNY 179 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKF 225 + L T A+ +G PA+G H K ++ D+ + + Sbjct: 180 GYAILRAATARALAGSGLLPALGIHHHNKYNAYCLADDLMEPYRP 224 >UniRef50_B8I084 CRISPR-associated protein Cas1 n=1 Tax=Clostridium cellulolyticum H10 RepID=B8I084_CLOCE Length = 298 Score = 42.3 bits (98), Expect = 0.024, Method: Composition-based stats. Identities = 33/222 (14%), Positives = 65/222 (29%), Gaps = 34/222 (15%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVGTLLVWVGEAG 82 ++ + + G IP+ + I+L+ ++ A + A+ L Sbjct: 13 KLKLKQNNLWVEQSDG--FSIPIDDINTIVLDSADVTITSALLSKLAEEDIALYSCDGKH 70 Query: 83 VRV--YASGQPGGARSDKLLYQAKLALDEDLR---LKVVRKMFELRF--------GEPAP 129 + + Q L+ R V +K+ F G Sbjct: 71 TPNGVLLPFSCHSRQYKIVKTQINLSAPFKKRCWQRVVQQKIENQAFCLNILELKGRDEL 130 Query: 130 ARRSVEQLRG----IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATS 185 S L G +E + +++L + D N ++ S Sbjct: 131 INLSKSVLSGDSTNVEAHAAKYYFSVL------------FTNFKRGMQDNTNYALNYGYS 178 Query: 186 CLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKF 225 L G I + G+ P+IG H + +F D + + Sbjct: 179 ILRGAVARTIASYGFIPSIGIHHRSELNNFNLADDFIEPFRP 220 >UniRef50_A9GQD8 Putative uncharacterized protein n=1 Tax=Sorangium cellulosum 'So ce 56' RepID=A9GQD8_SORC5 Length = 310 Score = 40.4 bits (93), Expect = 0.080, Method: Composition-based stats. 
Identities = 36/194 (18%), Positives = 55/194 (28%), Gaps = 41/194 (21%) Query: 91 PGGARSDKLLYQAKLALDED-----------LRLKVVRKMFELRFGEPAPARRS------ 133 + + Q + A DE ++K R M P A S Sbjct: 56 LESRNVELRVAQHRAASDEAFCLSFARGVVVSKIKNARTMLRRNHAAPEVAVLSELDQLA 115 Query: 134 --------VEQLRGIEGSRVRATYALLA--------KQYGVTWNGRRYDPKDWEKGDTIN 177 + L GIEG+ R + A + GR P D +N Sbjct: 116 RKAAEAPSLPSLLGIEGTAARVYFGAFAGMLKGAGEARGEFDLEGRNRRPPR----DPVN 171 Query: 178 QCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIADIIKFDTVVPKAFE 234 +S A + L + G P +GF H G+P + D+ + + Sbjct: 172 ALLSLAYALLAKDLATTLGTVGLDPLLGFYHQPRFGRP-ALALDLIEEFRPIVADSVVVA 230 Query: 235 IARRNPGEPDREVR 248 PD R Sbjct: 231 AINNGVVAPDDFQR 244 Searching..................................................done Results from round 3 Score E Sequences producing significant alignments: (bits) Value Sequences used in model and found again: UniRef50_Q46896 Uncharacterized protein ygbT n=119 Tax=cellular ... 326 9e-88 UniRef50_D1NTI3 CRISPR-associated protein Cas1 n=10 Tax=Bacteria... 305 1e-81 UniRef50_D1A5U2 CRISPR-associated protein Cas1 n=3 Tax=Actinomyc... 295 1e-78 UniRef50_D1CAI8 CRISPR-associated protein Cas1 n=3 Tax=Bacteria ... 292 1e-77 UniRef50_Q2JWC7 CRISPR-associated protein Cas1 n=3 Tax=Chroococc... 290 4e-77 UniRef50_C7QEM2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria ... 289 9e-77 UniRef50_D1CGD6 CRISPR-associated protein Cas1 n=7 Tax=cellular ... 285 1e-75 UniRef50_B4V4P5 Crispr-associated protein cas1 n=4 Tax=Streptomy... 285 1e-75 UniRef50_C1YVP5 CRISPR-associated protein Cas1 n=1 Tax=Nocardiop... 284 3e-75 UniRef50_D2RB04 CRISPR-associated endonuclease Cas1, ECOLI subty... 283 5e-75 UniRef50_C7LYW4 CRISPR-associated protein Cas1 n=1 Tax=Acidimicr... 281 3e-74 UniRef50_B1VIX8 CRISPR-associated protein n=6 Tax=Corynebacteriu... 279 6e-74 UniRef50_C7MTM5 CRISPR-associated protein, Cas1 family n=6 Tax=A... 278 2e-73 UniRef50_Q3ZZ81 CRISPR-associated protein Cas1 n=4 Tax=Bacteria ... 
277 3e-73 UniRef50_A8M406 CRISPR-associated protein Cas1 n=5 Tax=Actinomyc... 274 3e-72 UniRef50_C7MTL6 CRISPR-associated protein, Cas1 family n=4 Tax=A... 264 2e-69 UniRef50_C2BS02 CRISPR-associated protein n=1 Tax=Mobiluncus cur... 264 3e-69 UniRef50_Q03C58 CRISPR-associated protein n=3 Tax=Lactobacillus ... 261 2e-68 UniRef50_Q47PJ6 CRISPR-associated protein, Cas1 family n=4 Tax=A... 244 4e-63 UniRef50_C4X9I5 CRISPR-associated Cas1 family protein n=12 Tax=B... 242 9e-63 UniRef50_B6IWM1 CRISPR-associated protein Cas1, putative n=1 Tax... 238 2e-61 UniRef50_C9M9R9 CRISPR-associated protein Cas1 n=1 Tax=Jonquetel... 233 5e-60 UniRef50_Q21QB1 CRISPR-associated protein Cas1 n=1 Tax=Rhodofera... 225 2e-57 UniRef50_C8W2P4 CRISPR-associated protein Cas1 n=1 Tax=Desulfoto... 224 2e-57 UniRef50_Q0AA34 CRISPR-associated protein Cas1 n=11 Tax=Bacteria... 222 9e-57 UniRef50_C1ZJF3 CRISPR-associated protein, Cas1 family; CRISPR-a... 221 2e-56 UniRef50_D0MKV7 CRISPR-associated protein Cas1 n=1 Tax=Rhodother... 220 4e-56 UniRef50_UPI00016C522C CRISPR-associated protein Cas1/Cas4 n=1 T... 220 4e-56 UniRef50_Q467D6 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Meth... 218 3e-55 UniRef50_Q74H36 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Geob... 217 3e-55 UniRef50_C2KP50 CRISPR-associated Cas1 family protein n=5 Tax=Ac... 216 9e-55 UniRef50_C7QUZ4 CRISPR-associated protein Cas1 n=8 Tax=Cyanobact... 215 1e-54 UniRef50_C3WS02 CRISPR-associated protein n=2 Tax=Fusobacterium ... 215 1e-54 UniRef50_B9K7F7 CRISPR-associated protein, Cas1 family n=1 Tax=T... 214 4e-54 UniRef50_B7KMR5 CRISPR-associated protein Cas1 n=3 Tax=Chroococc... 214 5e-54 UniRef50_B8GLF6 CRISPR-associated Cas1/Cas4 family protein n=1 T... 210 4e-53 UniRef50_A5ILM3 CRISPR-associated protein, Cas1 family n=3 Tax=B... 210 5e-53 UniRef50_UPI0001C16754 protein of unknown function DUF48 n=1 Tax... 208 2e-52 UniRef50_C9M4E6 CRISPR-associated protein cas1 n=1 Tax=Lactobaci... 
207 3e-52 UniRef50_C4G3M4 Putative uncharacterized protein n=1 Tax=Abiotro... 207 3e-52 UniRef50_C1DUM1 Crispr-associated protein Cas1 n=18 Tax=Bacteria... 207 5e-52 UniRef50_B9YDC3 Putative uncharacterized protein n=1 Tax=Holdema... 207 6e-52 UniRef50_B7IHY4 Cas crispr-associated protein Cas1 n=4 Tax=Bacte... 206 7e-52 UniRef50_D2NTT1 Uncharacterized protein predicted to be involved... 206 7e-52 UniRef50_Q1CWU5 CRISPR-associated protein Cas1 n=1 Tax=Myxococcu... 204 3e-51 UniRef50_C0BZ41 Putative uncharacterized protein n=3 Tax=Clostri... 204 4e-51 UniRef50_B0K547 CRISPR-associated protein Cas1 n=12 Tax=Bacteria... 203 6e-51 UniRef50_O57912 Putative uncharacterized protein PH0173 n=1 Tax=... 202 9e-51 UniRef50_A1A2M8 CRISPR-associated DNA polymerase n=14 Tax=Bacter... 202 1e-50 UniRef50_A7HNI6 CRISPR-associated protein Cas1 n=1 Tax=Fervidoba... 202 1e-50 UniRef50_C9LM09 CRISPR-associated protein Cas1 n=1 Tax=Dialister... 201 3e-50 UniRef50_Q2SIC8 CRISPR-associated protein Cas1 n=6 Tax=Gammaprot... 200 4e-50 UniRef50_B1X158 DUF48-containing protein n=12 Tax=Cyanobacteria ... 200 5e-50 UniRef50_A1BI39 CRISPR-associated protein Cas1 n=5 Tax=Chlorobia... 199 1e-49 UniRef50_C2GEC7 Crispr-associated protein Cas1 n=2 Tax=Corynebac... 198 2e-49 UniRef50_Q2RY11 CRISPR-associated protein, Cas1 family / CRISPR-... 198 2e-49 UniRef50_C0QR16 Crispr-associated protein Cas1 n=23 Tax=Bacteria... 198 2e-49 UniRef50_D0LSW9 CRISPR-associated protein Cas1 n=1 Tax=Haliangiu... 198 2e-49 UniRef50_A7HMV0 CRISPR-associated protein Cas1 n=3 Tax=Thermotog... 198 3e-49 UniRef50_A1HM55 CRISPR-associated protein Cas1 n=2 Tax=Thermosin... 197 6e-49 UniRef50_A5D0Y0 Uncharacterized protein n=40 Tax=cellular organi... 196 8e-49 UniRef50_A9AYP8 CRISPR-associated protein Cas1 n=1 Tax=Herpetosi... 195 1e-48 UniRef50_C1XWQ6 CRISPR-associated protein, Cas1 family n=3 Tax=T... 195 1e-48 UniRef50_A5UXM3 CRISPR-associated protein, Cas1 family n=4 Tax=C... 
195 1e-48 UniRef50_C9M4Y8 CRISPR-associated protein Cas1 n=1 Tax=Jonquetel... 195 1e-48 UniRef50_D1BQ37 CRISPR-associated protein Cas1 n=1 Tax=Veillonel... 195 2e-48 UniRef50_C0A724 CRISPR-associated Cas1/Cas4 family protein n=1 T... 195 2e-48 UniRef50_Q3B3C1 CRISPR-associated protein, Cas1 family n=20 Tax=... 195 2e-48 UniRef50_C5BP90 CRISPR-associated protein Cas1 n=3 Tax=Gammaprot... 195 2e-48 UniRef50_A4J500 CRISPR-associated protein Cas1 n=1 Tax=Desulfoto... 195 2e-48 UniRef50_A2SRR7 CRISPR-associated protein, Cas1 family n=24 Tax=... 194 3e-48 UniRef50_C7G6C1 CRISPR-associated protein Cas1 n=3 Tax=Firmicute... 194 3e-48 UniRef50_C4FMU2 Putative uncharacterized protein n=1 Tax=Veillon... 193 6e-48 UniRef50_Q3J7J6 CRISPR-associated protein, Cas1 family n=2 Tax=N... 193 6e-48 UniRef50_B2RM83 CRISPR-associated protein Cas1 n=3 Tax=Bacteroid... 192 9e-48 UniRef50_Q1Q3I6 Putative uncharacterized protein n=1 Tax=Candida... 192 1e-47 UniRef50_UPI000197AF65 hypothetical protein BACCOPRO_02409 n=1 T... 192 1e-47 UniRef50_C1XN81 CRISPR-associated protein Cas1 n=2 Tax=Meiotherm... 192 2e-47 UniRef50_B4W4R1 CRISPR-associated protein Cas1 n=1 Tax=Microcole... 191 2e-47 UniRef50_Q8F1F5 Putative uncharacterized protein n=2 Tax=Leptosp... 191 3e-47 UniRef50_B9LWK7 CRISPR-associated protein Cas1 n=4 Tax=Halobacte... 191 3e-47 UniRef50_D0KYZ2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria ... 190 6e-47 UniRef50_O66692 Putative uncharacterized protein n=2 Tax=Aquific... 189 9e-47 UniRef50_B8DWG2 CRISPR-associated protein Cas1 n=4 Tax=Bifidobac... 189 9e-47 UniRef50_Q2NH78 Putative uncharacterized protein n=1 Tax=Methano... 189 1e-46 UniRef50_B0JKW9 CRISPR-associated protein Cas1 n=6 Tax=Cyanobact... 189 1e-46 UniRef50_Q2FQQ2 CRISPR-associated protein Cas1 n=2 Tax=Methanomi... 189 1e-46 UniRef50_B7KM77 CRISPR-associated protein Cas1 n=1 Tax=Cyanothec... 189 2e-46 UniRef50_Q8YWX6 Alr1468 protein n=4 Tax=Cyanobacteria RepID=Q8YW... 
187 3e-46 UniRef50_B7GYY4 CRISPR-associated protein Cas1 n=8 Tax=Proteobac... 187 3e-46 UniRef50_Q1J1U7 CRISPR-associated protein Cas1 n=1 Tax=Deinococc... 187 3e-46 UniRef50_B2SPB2 Crispr-associated protein Cas1 n=56 Tax=Bacteria... 186 7e-46 UniRef50_Q53W21 Putative uncharacterized protein TTHB145 n=3 Tax... 186 9e-46 UniRef50_B9M5J4 CRISPR-associated protein Cas1 n=9 Tax=Bacteria ... 185 2e-45 UniRef50_C0FSR1 Putative uncharacterized protein n=1 Tax=Rosebur... 184 2e-45 UniRef50_B7C8S2 Putative uncharacterized protein n=2 Tax=Eubacte... 184 3e-45 UniRef50_Q6L317 DNA polymerase n=2 Tax=Thermoplasmatales RepID=Q... 184 3e-45 UniRef50_C6CA70 CRISPR-associated protein Cas1 n=56 Tax=Bacteria... 184 3e-45 UniRef50_Q0AW57 CRISPR-associated protein, Cas1 family n=1 Tax=S... 183 4e-45 UniRef50_B8G918 CRISPR-associated protein Cas1 n=5 Tax=Chlorofle... 183 5e-45 UniRef50_A4X3M4 CRISPR-associated protein, Cas1 family n=4 Tax=A... 182 1e-44 UniRef50_C9RRG3 CRISPR-associated protein Cas1 n=1 Tax=Fibrobact... 182 2e-44 UniRef50_B5IHG3 CRISPR-associated protein Cas1 n=3 Tax=Acidulipr... 181 2e-44 UniRef50_UPI0000F51762 hypothetical protein Faci_00015 n=1 Tax=F... 181 2e-44 UniRef50_Q65S18 Putative uncharacterized protein n=1 Tax=Mannhei... 181 2e-44 UniRef50_C7RP03 CRISPR-associated protein Cas1 n=1 Tax=Candidatu... 180 4e-44 UniRef50_A4FJX8 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Sacc... 180 4e-44 UniRef50_D1N0J7 CRISPR-associated protein Cas1 n=1 Tax=Victivall... 180 5e-44 UniRef50_D0MJ58 CRISPR-associated protein Cas1 n=1 Tax=Rhodother... 179 1e-43 UniRef50_C8Q0H7 CRISPR-associated protein Cas1 n=6 Tax=Proteobac... 179 1e-43 UniRef50_UPI0001C41A73 CRISPR-associated protein Cas1-2 n=1 Tax=... 178 2e-43 UniRef50_D1VVR4 CRISPR-associated endonuclease Cas1, DVULG subty... 178 2e-43 UniRef50_A9BUF1 CRISPR-associated protein Cas1 n=33 Tax=Proteoba... 178 2e-43 UniRef50_Q6L363 DNA polymerase n=1 Tax=Picrophilus torridus RepI... 
178 2e-43 UniRef50_B7A8Y4 CRISPR-associated protein Cas1 n=1 Tax=Thermus a... 177 4e-43 UniRef50_B8GSH8 CRISPR-associated protein Cas1 n=1 Tax=Thioalkal... 176 9e-43 UniRef50_C8W3G7 CRISPR-associated protein Cas1 n=24 Tax=Bacteria... 176 9e-43 UniRef50_Q74N45 NEQ017 n=1 Tax=Nanoarchaeum equitans RepID=Q74N4... 175 1e-42 UniRef50_Q8YZS6 Alr0381 protein n=6 Tax=Cyanobacteria RepID=Q8YZ... 175 2e-42 UniRef50_Q2LQX3 Uncharacterized protein predicted to be involved... 173 5e-42 UniRef50_B9LX94 CRISPR-associated protein Cas1 n=2 Tax=Halobacte... 173 5e-42 UniRef50_D0MKP4 CRISPR-associated protein Cas1 n=1 Tax=Rhodother... 173 9e-42 UniRef50_D2LF35 CRISPR-associated protein Cas1 n=1 Tax=Rhodomicr... 172 1e-41 UniRef50_C7NA04 CRISPR-associated protein Cas1 n=1 Tax=Leptotric... 171 2e-41 UniRef50_A1BI46 CRISPR-associated protein Cas1 n=2 Tax=Chlorobia... 171 2e-41 UniRef50_A9AX66 CRISPR-associated protein Cas1 n=1 Tax=Herpetosi... 171 2e-41 UniRef50_Q2J7N9 CRISPR-associated protein Cas1 n=1 Tax=Frankia s... 171 3e-41 UniRef50_A9GDF7 Putative uncharacterized protein n=1 Tax=Sorangi... 170 4e-41 UniRef50_B3EG05 CRISPR-associated protein Cas1 n=11 Tax=Bacteria... 170 4e-41 UniRef50_Q0ADY5 CRISPR-associated protein, Cas1 family n=2 Tax=N... 170 7e-41 UniRef50_Q2FL78 CRISPR-associated protein, Cas1 family n=1 Tax=M... 169 1e-40 UniRef50_B8CYA1 CRISPR-associated protein Cas1 n=2 Tax=cellular ... 169 1e-40 UniRef50_A0LHZ4 CRISPR-associated protein, Cas1 family n=2 Tax=D... 168 1e-40 UniRef50_B0TFX3 Crispr-associated protein cas1 n=4 Tax=Clostridi... 168 1e-40 UniRef50_C8WTR3 CRISPR-associated protein Cas1 n=2 Tax=Alicyclob... 168 2e-40 UniRef50_A7GY67 Crispr-associated protein Cas1 n=6 Tax=Campyloba... 166 7e-40 UniRef50_Q96X75 Putative uncharacterized protein ST2634 n=1 Tax=... 165 1e-39 UniRef50_A8UXX8 Putative uncharacterized protein n=1 Tax=Hydroge... 165 2e-39 UniRef50_D0W646 CRISPR-associated protein Cas1 n=1 Tax=Neisseria... 
165 2e-39 UniRef50_A4FXZ8 CRISPR-associated protein, Cas1 family n=9 Tax=c... 165 2e-39 UniRef50_B5YJS2 Crispr-associated protein Cas1 n=1 Tax=Thermodes... 164 3e-39 UniRef50_A7NP58 CRISPR-associated protein Cas1 n=6 Tax=Chlorofle... 163 9e-39 UniRef50_A7BYC5 Protein containing DUF48 n=1 Tax=Beggiatoa sp. P... 163 9e-39 UniRef50_A6UNF5 CRISPR-associated protein Cas1 n=1 Tax=Methanoco... 163 9e-39 UniRef50_B0VIK5 Putative uncharacterized protein n=1 Tax=Candida... 162 1e-38 UniRef50_B6IX22 CRISPR-associated protein Cas1, putative n=2 Tax... 162 2e-38 UniRef50_C0QHV1 Putative CRISPR-associated protein (Uncharacteri... 160 6e-38 UniRef50_A5UJ50 Uncharacterized protein predicted to be involved... 159 9e-38 UniRef50_Q03KT5 CRISPR-associated protein, Cas1 family n=5 Tax=S... 159 1e-37 UniRef50_P71636 CRISPR-associated protein Cas1 n=11 Tax=Mycobact... 159 1e-37 UniRef50_O28401 Putative uncharacterized protein n=1 Tax=Archaeo... 158 2e-37 UniRef50_C3MWK6 CRISPR-associated protein Cas1 n=6 Tax=Sulfolobu... 157 4e-37 UniRef50_B8GDW2 CRISPR-associated protein Cas1 n=1 Tax=Methanosp... 157 5e-37 UniRef50_C3MX12 CRISPR-associated protein Cas1 n=11 Tax=Sulfolob... 157 6e-37 UniRef50_B1YCK7 CRISPR-associated protein Cas1 n=2 Tax=Thermopro... 155 2e-36 UniRef50_C9LGP6 CRISPR-associated protein Cas1 n=3 Tax=Prevotell... 155 2e-36 UniRef50_Q1CW50 CRISPR-associated fusion protein Cas4/Cas1 n=5 T... 154 3e-36 UniRef50_C5SD37 CRISPR-associated protein Cas1 n=1 Tax=Allochrom... 153 9e-36 UniRef50_A3XI90 Putative uncharacterized protein n=1 Tax=Leeuwen... 152 1e-35 UniRef50_A8ABK8 CRISPR-associated protein Cas1 n=1 Tax=Ignicoccu... 152 2e-35 UniRef50_C0W0W5 CRISPR-associated Cas1 family protein n=1 Tax=Ac... 151 3e-35 UniRef50_Q2FPW6 CRISPR-associated protein Cas1 n=1 Tax=Methanosp... 149 1e-34 UniRef50_A6UVX9 CRISPR-associated protein Cas1 n=2 Tax=Methanoco... 149 1e-34 UniRef50_C8PKY6 Putative CRISPR-associated protein Cas1 n=1 Tax=... 
148 2e-34 UniRef50_D2QT50 CRISPR-associated protein Cas1 n=1 Tax=Spirosoma... 148 2e-34 UniRef50_A1WUP2 CRISPR-associated protein Cas1 n=1 Tax=Halorhodo... 148 3e-34 UniRef50_D0YU98 Crispr-associated protein Cas1 n=1 Tax=Mobiluncu... 147 6e-34 UniRef50_A3CTI4 CRISPR-associated protein Cas1 n=1 Tax=Methanocu... 146 8e-34 UniRef50_C8N6V7 Putative uncharacterized protein n=1 Tax=Cardiob... 146 1e-33 UniRef50_B3PMY9 Putative uncharacterized protein n=1 Tax=Mycopla... 145 2e-33 UniRef50_B4ATI8 Crispr-associated protein Cas1 n=5 Tax=Proteobac... 145 2e-33 UniRef50_UPI0001BCCAFD hypothetical protein SnoxA4_00467 n=1 Tax... 145 2e-33 UniRef50_B8D4S7 CRISPR-associated protein Cas1 n=1 Tax=Desulfuro... 142 2e-32 UniRef50_A6DE79 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Cami... 142 2e-32 UniRef50_A3DLB7 CRISPR-associated protein, Cas1 family n=1 Tax=S... 142 2e-32 UniRef50_Q8F874 Putative uncharacterized protein n=1 Tax=Leptosp... 141 3e-32 UniRef50_A7I668 CRISPR-associated protein Cas1 n=1 Tax=Candidatu... 140 5e-32 UniRef50_A3MVN2 CRISPR-associated protein, Cas1 family n=4 Tax=T... 140 5e-32 UniRef50_D2QAN8 CRISPR-associated protein Cas1 n=2 Tax=Bifidobac... 140 6e-32 UniRef50_C6I8L1 CRISPR-associated protein n=1 Tax=Bacteroides sp... 140 7e-32 UniRef50_A0LWB2 CRISPR-associated protein Cas1 n=1 Tax=Acidother... 138 2e-31 UniRef50_Q7MRD4 Putative uncharacterized protein n=1 Tax=Wolinel... 138 2e-31 UniRef50_A1WH94 CRISPR-associated protein, Cas1 family n=5 Tax=P... 138 2e-31 UniRef50_B5IAF4 CRISPR-associated protein Cas1 n=3 Tax=Euryarcha... 138 2e-31 UniRef50_A3ZPG0 Putative uncharacterized protein n=1 Tax=Blastop... 137 4e-31 UniRef50_D2R8Z2 CRISPR-associated protein Cas1 n=1 Tax=Pirellula... 137 5e-31 UniRef50_Q7MTH7 CRISPR-associated protein Cas1 n=3 Tax=Porphyrom... 135 1e-30 UniRef50_B1GZM4 CRISPR-associated protein Cas1 n=1 Tax=unculture... 135 2e-30 UniRef50_Q9YCL8 Putative CRISPR-associated protein Cas1 n=1 Tax=... 
135 2e-30 UniRef50_Q1WVJ8 CRISPR-associated protein n=1 Tax=Lactobacillus ... 134 4e-30 UniRef50_A1ZHZ5 Crispr-associated protein Cas1 n=1 Tax=Microscil... 134 4e-30 UniRef50_Q8TVS6 Uncharacterized conserved protein n=1 Tax=Methan... 134 4e-30 UniRef50_B1L400 CRISPR-associated protein Cas1 n=2 Tax=Archaea R... 133 6e-30 UniRef50_Q5X8T5 Putative uncharacterized protein n=1 Tax=Legione... 133 6e-30 UniRef50_C9RJP2 CRISPR-associated protein Cas1 n=21 Tax=cellular... 132 2e-29 UniRef50_A2SQK9 Uncharacterized protein predicted to be involved... 131 2e-29 UniRef50_A2BKJ8 Universally conserved protein n=1 Tax=Hypertherm... 131 2e-29 UniRef50_B2KB47 CRISPR-associated protein Cas1 n=2 Tax=Elusimicr... 131 2e-29 UniRef50_B9M9X7 Putative uncharacterized protein n=1 Tax=Diaphor... 131 3e-29 UniRef50_C7V674 Predicted protein n=2 Tax=Enterococcus faecalis ... 131 3e-29 UniRef50_Q13CC1 CRISPR-associated protein, Cas1 family n=2 Tax=R... 130 5e-29 UniRef50_B2UP48 CRISPR-associated protein Cas1 n=1 Tax=Akkermans... 130 5e-29 UniRef50_A8TI31 CRISPR-associated protein Cas1 n=1 Tax=Methanoco... 130 7e-29 UniRef50_A8LN06 CRISPR-associated protein Cas1 n=2 Tax=Rhodobact... 129 1e-28 UniRef50_A6DE65 Putative uncharacterized protein n=1 Tax=Caminib... 128 2e-28 UniRef50_A6VLA8 CRISPR-associated protein Cas1 n=13 Tax=Proteoba... 128 2e-28 UniRef50_C0WRP8 CRISPR-associated protein n=18 Tax=Lactobacillac... 127 6e-28 UniRef50_B4AQ39 Crispr-associated protein Cas1 n=5 Tax=Francisel... 126 9e-28 UniRef50_C6MJ62 CRISPR-associated protein Cas1 n=5 Tax=Nitrosomo... 125 2e-27 UniRef50_Q03LF6 CRISPR-associated protein, Cas1 family n=6 Tax=S... 124 3e-27 UniRef50_C3WD45 CRISPR-associated protein cas1 n=1 Tax=Fusobacte... 123 7e-27 UniRef50_C6HZN2 CRISPR-associated protein Cas1 n=1 Tax=Leptospir... 123 1e-26 UniRef50_B3W9S5 CRISPR-associated protein n=4 Tax=Lactobacillus ... 120 8e-26 UniRef50_B3E1C9 CRISPR-associated protein Cas1 n=1 Tax=Methylaci... 
118 3e-25 UniRef50_C7XMU1 CRISPR-associated protein cas1 n=2 Tax=Fusobacte... 116 9e-25 UniRef50_UPI00016B206F hypothetical protein cdiviTM7_00753 n=1 T... 115 3e-24 UniRef50_A8REI1 Putative uncharacterized protein n=2 Tax=unclass... 114 4e-24 UniRef50_A8ABE8 CRISPR-associated protein Cas1 n=1 Tax=Ignicoccu... 114 4e-24 UniRef50_A7HP88 CRISPR-associated protein Cas1 n=1 Tax=Parvibacu... 114 5e-24 UniRef50_Q5LZX6 Putative uncharacterized protein n=2 Tax=Strepto... 113 8e-24 UniRef50_B5ZLL1 CRISPR-associated protein Cas1 n=2 Tax=Gluconace... 113 8e-24 UniRef50_D2PIT7 CRISPR-associated protein Cas1 n=2 Tax=Sulfolobu... 105 2e-21 UniRef50_A1RZT8 CRISPR-associated protein Cas1 n=1 Tax=Thermofil... 103 6e-21 UniRef50_C5EZ73 Crispr-protein cas1 n=1 Tax=Helicobacter pulloru... 92 3e-17 Sequences not found previously or not previously below threshold: UniRef50_Q57823 Uncharacterized protein MJ0378 n=20 Tax=Euryarch... 185 2e-45 UniRef50_UPI000174611D CRISPR-associated protein Cas1/Cas4 n=1 T... 173 5e-42 UniRef50_B0VHC1 Putative uncharacterized protein n=1 Tax=Candida... 164 4e-39 UniRef50_C5EH11 Crispr-protein cas1 n=1 Tax=Clostridiales bacter... 159 1e-37 UniRef50_B8I084 CRISPR-associated protein Cas1 n=1 Tax=Clostridi... 123 7e-27 UniRef50_C0WE67 Crispr-associated protein n=1 Tax=Acidaminococcu... 123 1e-26 UniRef50_C7G696 CRISPR-associated protein Cas1 n=2 Tax=Roseburia... 122 2e-26 UniRef50_A9GQD8 Putative uncharacterized protein n=1 Tax=Sorangi... 121 2e-26 UniRef50_Q03JI7 CRISPR-associated protein, Cas1 family n=40 Tax=... 116 7e-25 UniRef50_C2EF73 CRISPR-associated Cas1 family protein n=1 Tax=La... 116 1e-24 UniRef50_C8PNV3 CRISPR-associated protein Cas1 n=1 Tax=Treponema... 115 2e-24 UniRef50_Q6KIQ8 Conserved expressed putative DNA-repair protein ... 113 9e-24 UniRef50_B1BJM4 Crispr-associated protein Cas1 n=2 Tax=Clostridi... 113 1e-23 UniRef50_Q5HK87 CRISPR-associated protein Cas1 n=1 Tax=Staphyloc... 
112 2e-23 UniRef50_D1AUW5 CRISPR-associated protein Cas1 n=1 Tax=Streptoba... 111 3e-23 UniRef50_C2CKI7 CRISPR associated protein n=1 Tax=Anaerococcus t... 110 6e-23 UniRef50_Q5HSQ9 CRISPR-associated protein Cas1 n=12 Tax=Campylob... 109 1e-22 UniRef50_C4ZA17 Putative uncharacterized protein n=2 Tax=Eubacte... 105 2e-21 UniRef50_UPI0001977683 putative CRISPR-associated protein n=1 Ta... 104 5e-21 UniRef50_Q73QW5 CRISPR-associated protein Cas1 n=7 Tax=Bacteria ... 102 1e-20 UniRef50_A7BYT7 Putative uncharacterized protein n=1 Tax=Beggiat... 101 4e-20 UniRef50_C5NZ03 CRISPR-associated protein Cas1 n=2 Tax=Firmicute... 100 8e-20 UniRef50_D0WRI5 CRISPR-associated protein Cas1, NMENI subtype n=... 99 2e-19 UniRef50_Q4A5I1 Putative uncharacterized protein n=5 Tax=Mycopla... 98 5e-19 UniRef50_C5F1H0 Crispr-protein cas1 n=1 Tax=Helicobacter pulloru... 97 8e-19 UniRef50_B9CMG3 CRISPR-associated protein Cas1, NMENI subtype n=... 85 3e-15 >UniRef50_Q46896 Uncharacterized protein ygbT n=119 Tax=cellular organisms RepID=YGBT_ECOLI Length = 305 Score = 326 bits (835), Expect = 9e-88, Method: Composition-based stats. 
Identities = 305/305 (100%), Positives = 305/305 (100%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV 60 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV Sbjct: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV 60 Query: 61 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF 120 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF Sbjct: 61 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF 120 Query: 121 ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180 ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI Sbjct: 121 ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180 Query: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP 240 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP Sbjct: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP 240 Query: 241 GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDA 300 GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDA Sbjct: 241 GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDA 300 Query: 301 GHRSS 305 GHRSS Sbjct: 301 GHRSS 305 >UniRef50_D1NTI3 CRISPR-associated protein Cas1 n=10 Tax=Bacteria RepID=D1NTI3_9BIFI Length = 366 Score = 305 bits (781), Expect = 1e-81, Method: Composition-based stats. 
Identities = 106/296 (35%), Positives = 162/296 (54%), Gaps = 4/296 (1%) Query: 7 NPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVR 66 + +DR++ ++ ++ ++ + A + D G+R HIP +++ +ML PGT V+H A+ Sbjct: 32 ELVRCEDRLTFLYFEHCVVNRDNNAITVTDDRGVR-HIPAAALSVLMLGPGTSVTHQAMM 90 Query: 67 LAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE 126 + G ++WVGE GVR Y SG+P S+ L QA+L + RL V R M+ +RF Sbjct: 91 VIGDNGATVIWVGERGVRTYCSGKPLTHSSNLLQKQAQLVTNMRKRLSVARAMYAMRFPH 150 Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 + +++QLRG EG+RVR Y +KQ GV W R Y P+D+ D INQ +SAA C Sbjct: 151 EDVSNLTMQQLRGREGARVRRVYRHWSKQTGVRWERRDYRPEDFADSDRINQALSAANIC 210 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDRE 246 LYG+ A I+A G +P +GFVHTG LSFVYD+AD+ K + +P AF+ A + Sbjct: 211 LYGIAHAVIVALGCSPGLGFVHTGHELSFVYDMADLYKAELSIPVAFKTAATEVDDIGGA 270 Query: 247 VRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP---EDAQPVAIPLPVSLGD 299 VR A RD + +++ I + A + + D + + S GD Sbjct: 271 VRRAMRDAMYDLSIMPRMVKDIHHLFDAADAENEGNNLYLWDGKEGTVEAGRSYGD 326 >UniRef50_D1A5U2 CRISPR-associated protein Cas1 n=3 Tax=Actinomycetales RepID=D1A5U2_THECD Length = 315 Score = 295 bits (756), Expect = 1e-78, Method: Composition-based stats. 
Identities = 104/297 (35%), Positives = 154/297 (51%), Gaps = 7/297 (2%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + DR+S I+L+ + D A D G THIP ++ C++L PGTRV+H A Sbjct: 12 PRELTRMSDRISFIYLERCTLHREDNAITAEDADG-ITHIPSATIGCLLLGPGTRVTHQA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + + G +VWVGE GVR Y+ G+ S + QA + RL+V R M+ +RF Sbjct: 71 MSVLGDSGAGVVWVGEQGVRFYSGGRSLTRSSALVEAQAIKWANRRTRLEVARAMYRMRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + PA + ++L G EG RV+ Y A +YG+TW GR Y P D+ D +NQ ++AA Sbjct: 131 PDEDPAGLTRQELLGREGRRVKERYRQEAAKYGITWKGRHYIPGDFGSSDPVNQAVTAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C+YG+ + + A G +P +GF+H+G L+FV DIAD+ K + +P AF +P + Sbjct: 191 QCMYGIAQTTVAALGCSPGLGFIHSGHELAFVLDIADLYKTEFALPIAFRTVAESPEDVG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLGDAG 301 R A RD L + + I+ +L P P+D I GD G Sbjct: 251 SRTRRAIRDEVNRVGLLRRCVDDIKSLL------LPDVPDDPLNSDIDQVTLQGDHG 301 >UniRef50_D1CAI8 CRISPR-associated protein Cas1 n=3 Tax=Bacteria RepID=D1CAI8_SPHTD Length = 314 Score = 292 bits (747), Expect = 1e-77, Method: Composition-based stats. 
Identities = 120/311 (38%), Positives = 180/311 (57%), Gaps = 14/311 (4%) Query: 4 LPLNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 + L+ +P ++D S +++++ +I+ A + D G+ +P S+ +ML PGT +SH Sbjct: 1 MDLHILPKVRDSWSYLYVEHARIEQEAKAIAIHDAVGM-VPVPCASLGILMLGPGTSISH 59 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 AA+R A+ G L++W GE GVR YA G + L+ QA+L D LRL+VV +M+++ Sbjct: 60 AAIRTLAENGCLVLWTGEEGVRFYAQGLGETRSARNLMRQARLWADPALRLRVVFRMYQM 119 Query: 123 RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISA 182 RF EP P +++Q+RG+EG+RVR YA +++ GV W GR + ++W D IN+ +S Sbjct: 120 RFSEPLPPDLTLQQIRGMEGARVRDAYARASRETGVPWRGRSFQRRNWSATDPINRALSC 179 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGE 242 A SCLYG+ AAI++ GY+P +GF+HTGK LSFVYDIAD+ K +P AF + + Sbjct: 180 ANSCLYGICHAAIVSLGYSPGLGFIHTGKMLSFVYDIADLYKATVTIPLAFRVVAEGTHD 239 Query: 243 PDREVRLACRDIFRSSKTLAKLIPLIEDVL----AAGEIQPPAPPEDAQ--------PVA 290 + VR ACRD F + + L + IE VL A G P EDA A Sbjct: 240 LEGRVRRACRDAFVAHRLLGTIATDIEHVLDISDADGGADEPDFDEDAAYPGGLWDPSGA 299 Query: 291 IPLPVSLGDAG 301 + V+ G Sbjct: 300 VAGGVNHAPEG 310 >UniRef50_Q2JWC7 CRISPR-associated protein Cas1 n=3 Tax=Chroococcales RepID=Q2JWC7_SYNJA Length = 315 Score = 290 bits (743), Expect = 4e-77, Method: Composition-based stats. 
Identities = 112/303 (36%), Positives = 175/303 (57%), Gaps = 6/303 (1%) Query: 2 TWLPLNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV 60 T L IP ++D +S ++++ +I+ A ++ + G IP S+ +ML PGT + Sbjct: 6 TTKSLRSIPKVRDSISFVYVERCRIEQDAKAIAVLQEDGRYI-IPCASLTTLMLGPGTAI 64 Query: 61 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF 120 +HAA++ A + WVGE G+R YASG + ++L +QAKL D ++VVR+M+ Sbjct: 65 THAAIKNLADGLCSVQWVGEDGLRFYASGSHPSSSVERLYHQAKLWADPVQHMEVVRRMY 124 Query: 121 ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180 RF EP ++EQ+RG+EG RVR Y+ L+K+ GV W GR Y K+WE D +N+ + Sbjct: 125 SFRFPEPLKEGLTLEQIRGLEGVRVRTVYSRLSKETGVNWKGRSYKLKEWECADPVNRAL 184 Query: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNP 240 S A +CLY V +AA+ A GY+ A+GF+H GKPLSFVYD+AD+ K + +P AF+ A Sbjct: 185 SVANTCLYAVCQAALNAVGYSTALGFIHIGKPLSFVYDVADLYKTEITIPVAFKAAAELM 244 Query: 241 GEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP----EDAQPVAIPLPVS 296 + R CR+ F + + ++I ++ +L Q + P D + A+ VS Sbjct: 245 PNFESRTRQLCREKFVEHRLMQRIIDDVDAILGFRATQEESSPVGSLWDNEKGAVEGGVS 304 Query: 297 LGD 299 G+ Sbjct: 305 YGE 307 >UniRef50_C7QEM2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria RepID=C7QEM2_CATAD Length = 323 Score = 289 bits (740), Expect = 9e-77, Method: Composition-based stats. 
Identities = 95/307 (30%), Positives = 156/307 (50%), Gaps = 13/307 (4%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + DRVS ++L+ + A D G THIP ++ ++L PGTR++H A Sbjct: 12 PYQLPRIADRVSFVYLERCTVHRDANAITAQDADG-ITHIPSATIGTLLLGPGTRITHQA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + + G + WVGE G R YA+ + S + QA L + RL + R M+ +RF Sbjct: 71 MAVLGDCGANVAWVGEHGARFYAAARSLNRSSALVEAQATLWANRRTRLDIARAMYRMRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + P+ +QL G+EG R++ Y +++ GV W+GR+Y P ++ GD INQ I+AA Sbjct: 131 PDEDPSGFMRQQLLGMEGRRLKDCYRQQSQRTGVPWHGRQYTPGNFNAGDAINQAITAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C+YGV I A G +P +GF+H+G LSFV DIAD+ K + +P AF+ A + + Sbjct: 191 QCMYGVAHTIITALGCSPGLGFIHSGHELSFVMDIADLYKTEIGIPVAFDTAAEDSTDIG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP------------EDAQPVAIP 292 R A R+ R+++ L + + ++ +L +P + + + + Sbjct: 251 PRTRRALREQIRTTRLLERCVDDVKALLTTPNNEPGSSDATDTGFFDQVQLQSDRETEVE 310 Query: 293 LPVSLGD 299 + D Sbjct: 311 GGRNYAD 317 >UniRef50_D1CGD6 CRISPR-associated protein Cas1 n=7 Tax=cellular organisms RepID=D1CGD6_THET1 Length = 324 Score = 285 bits (730), Expect = 1e-75, Method: Composition-based stats. 
Identities = 124/299 (41%), Positives = 175/299 (58%), Gaps = 5/299 (1%) Query: 1 MTWLPLNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTR 59 M L+ +P + D S +++++ +ID A L D TG +P S++ +ML PGT Sbjct: 1 MRLKDLHILPRVSDSWSYLYVEHCRIDQDARAISLHDATGKTM-VPCASLSLLMLGPGTS 59 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM 119 ++HAA++ A G L+ WVGE GVR YA G + L QA L D +L L+VVR+M Sbjct: 60 ITHAAIQTLADNGCLVAWVGEEGVRFYAQGMGETRSATNTLRQAMLWSDPELHLQVVRRM 119 Query: 120 FELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQC 179 +E+RF P S++Q+RG+EG+RVR Y L+++ GV W GR Y K W D IN+ Sbjct: 120 YEIRFRHPINPNTSLKQIRGMEGARVRGAYLQLSRETGVEWKGRDYSSKSWHSNDAINRA 179 Query: 180 ISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRN 239 ISAA SCLYGV AAI++AGY+ A+GF+HTGK LSFVYD+AD+ K + +P AF Sbjct: 180 ISAANSCLYGVCHAAIVSAGYSTALGFIHTGKMLSFVYDVADLYKTEISMPAAFYAVAEG 239 Query: 240 PGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVSLG 298 + VR CRDI R ++ LA+++ I+ VL + P P + P V G Sbjct: 240 GASLESRVRRKCRDILRETRLLARIVEDIDTVL---NVDSPIPHKYQNPYDSDPGVPGG 295 >UniRef50_B4V4P5 Crispr-associated protein cas1 n=4 Tax=Streptomyces RepID=B4V4P5_9ACTO Length = 315 Score = 285 bits (729), Expect = 1e-75, Method: Composition-based stats. 
Identities = 101/301 (33%), Positives = 155/301 (51%), Gaps = 7/301 (2%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + +R+S ++L+ + A D G THIP ++ ++L PGTR++H A Sbjct: 12 PRELTRVAERISFVYLERCVVHRDANAITAEDADGT-THIPSATIGTLLLGPGTRITHQA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + + A+ G + WVGE GVR YA G+ S + QA L + RL+V R M+ LRF Sbjct: 71 MSVLAESGAAVAWVGEQGVRYYAGGRALSRSSALVEAQATLWANRRTRLEVARAMYRLRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + P+ + +L G EG RV+ Y A + GV W GR Y P D+ GD NQ ++AA Sbjct: 131 PDEDPSGLTRRELLGHEGYRVKECYRHQADRTGVPWRGRHYVPGDFTAGDAPNQAVTAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C+YG+ A + A G A +GFVH+G LSFV D+AD+ K + +P AF++A + + Sbjct: 191 QCMYGIAHAVVAALGCATGLGFVHSGHELSFVLDVADLYKTEIGIPVAFDVAAESTEDIG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPED------AQPVAIPLPVSLG 298 R A RD ++ L + + I+ +L + +D VA+P + Sbjct: 251 SRTRRALRDAVNKNRLLDRCVNDIKLLLQPEGPGAASAADDVVMLESDSGVAVPAGRNYA 310 Query: 299 D 299 D Sbjct: 311 D 311 >UniRef50_C1YVP5 CRISPR-associated protein Cas1 n=1 Tax=Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111 RepID=C1YVP5_NOCDA Length = 327 Score = 284 bits (726), Expect = 3e-75, Method: Composition-based stats. 
Identities = 102/302 (33%), Positives = 148/302 (49%), Gaps = 6/302 (1%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 P + +R+S ++L+ + A D G R +P ++ ++L PGT V+H+A Sbjct: 12 PRELTRVGERLSFLYLERCVVHRDSNAITAEDGDGTRY-LPSATIGTLLLGPGTNVTHSA 70 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + L + G +VWVGE GVR YA+G+ S + QA + RL V R M+ +RF Sbjct: 71 MSLLGECGATVVWVGEHGVRYYAAGRALTRSSRLVEAQATAWANRRSRLDVARAMYRMRF 130 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + S + L G EG RV+A Y A + GVTW GRRY P D + D N+ I+AA Sbjct: 131 PDLDVEALSRQALLGKEGDRVKACYREQAARTGVTWRGRRYVPGDHDVSDPPNKAITAAA 190 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 C YGV A A G +P +GFVH+G FV D+AD+ K + +P AF+ A + + D Sbjct: 191 QCFYGVAHAVTAALGCSPGLGFVHSGHERGFVMDVADLYKVEIGIPVAFDAAAQGDEDVD 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIED-VLAAGEIQPPAPPEDAQPVAIPLPVS----LGD 299 R RD L + + I+ +L G + EDA A +GD Sbjct: 251 GVTRRLLRDRINEEGLLERCVRDIKALLLGEGSVGAQGEAEDAGESANDDVADTVGLVGD 310 Query: 300 AG 301 G Sbjct: 311 RG 312 >UniRef50_D2RB04 CRISPR-associated endonuclease Cas1, ECOLI subtype n=3 Tax=Bacteria RepID=D2RB04_GARVA Length = 313 Score = 283 bits (724), Expect = 5e-75, Method: Composition-based stats. 
Identities = 105/296 (35%), Positives = 166/296 (56%), Gaps = 7/296 (2%) Query: 10 PLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAA 69 + DRVS I++++ +I+ +D A + D G +P + ++L PGT ++H A+ L Sbjct: 17 RISDRVSFIYVEHAKINRLDSAVTVFDANGTI-RVPAAMIGVLLLGPGTEITHRAMELLG 75 Query: 70 QVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP 129 VG +VWVGE GVR YA G+ S L Q+KL + RL V RKM+++RF Sbjct: 76 DVGASIVWVGEHGVRNYAHGRALSRSSRLLEKQSKLVTNSRSRLNVARKMYQMRFPNENV 135 Query: 130 ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 + +++QLRG EG+RVR Y ++ +Y V WNGR Y D+E G +N+ +S CLYG Sbjct: 136 SSYTLQQLRGREGARVRHLYREMSNKYNVQWNGRDYKVNDFESGTVVNKALSVGNVCLYG 195 Query: 190 VTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARR--NPGEPDREV 247 + + I A G AP +GFVHTG LS VYDIAD+ K + +P +FEIA R + + ++ + Sbjct: 196 LVHSIISALGLAPGLGFVHTGHDLSLVYDIADLYKAELTIPASFEIAARCESDDDIEQLM 255 Query: 248 RLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP----EDAQPVAIPLPVSLGD 299 RL RD F + +++++ I+++L D + + + V+ + Sbjct: 256 RLKMRDCFANCNIMSRIVNDIQNLLEIPIDDQITVDVIHLWDDKELLVASGVNYSE 311 >UniRef50_C7LYW4 CRISPR-associated protein Cas1 n=1 Tax=Acidimicrobium ferrooxidans DSM 10331 RepID=C7LYW4_ACIFD Length = 314 Score = 281 bits (718), Expect = 3e-74, Method: Composition-based stats. 
Identities = 100/296 (33%), Positives = 162/296 (54%), Gaps = 5/296 (1%) Query: 7 NPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVR 66 P+ R S ++L++ + A V + ++G T++P +V ++L PGTR++H A+ Sbjct: 17 ELQPVSRRSSFVYLEHCVVHRDANAVVSVTESGT-TYLPAAAVGTLLLGPGTRITHQAML 75 Query: 67 LAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE 126 L + G ++ WVGE R+YA + L QA+L + RL+V R+M+++RF Sbjct: 76 LLGESGVVVCWVGEGDTRLYAWAPSLFQSTRFLEAQARLVSNRQDRLRVARQMYQMRFPG 135 Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 ++ ++++LRG+EG+R+R TY LA +G+ W+GR YDP + GD +N+ +S A S Sbjct: 136 EDVSKATMQRLRGMEGARIRRTYRHLASAFGIDWHGRHYDPNNSSAGDDVNRALSIANSV 195 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDRE 246 LYGV AI+A G +P +GFVHTG LSFVYD+AD+ K + +P AFE A + G + Sbjct: 196 LYGVVHTAIVALGCSPGLGFVHTGHSLSFVYDVADLYKVELAIPVAFEAAAQRTGSLSSQ 255 Query: 247 VRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP----EDAQPVAIPLPVSLG 298 VR R+ + L + + I +L + P D + +P Sbjct: 256 VRRTMRERIHEAHLLERAVDDIRLLLGTPDADLGGEPGLVLFDDRIGEVPAGTDYS 311 >UniRef50_B1VIX8 CRISPR-associated protein n=6 Tax=Corynebacterium RepID=B1VIX8_CORU7 Length = 312 Score = 279 bits (715), Expect = 6e-74, Method: Composition-based stats. 
Identities = 93/271 (34%), Positives = 154/271 (56%), Gaps = 1/271 (0%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 + DR+S ++++ + A + D+ G H+P +A ++L GTR+++AA+ L Sbjct: 17 MGDRISFLYVERAVVSRDGNALTITDQRG-VAHVPATQLAVLLLGTGTRITNAAMALLGD 75 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 G VWVGE GVR YA G+P S QA++ ++ RL+ R+M+ LRF + Sbjct: 76 CGVSTVWVGERGVRYYAHGRPPAKSSRLAELQARVVTNQRKRLECARRMYGLRFPGEDVS 135 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 + ++ QLRG EG+R++ YA AK+ GV WN RRYDP D++ D INQ ++ ++ LYG+ Sbjct: 136 KLTMAQLRGREGARMKRLYAAEAKRTGVAWNRRRYDPNDYDSSDPINQALTTGSAALYGI 195 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 A I+ G+ PA+G +HTG SFVYD+AD+ K + +P AF + VR Sbjct: 196 AHAVIVGLGFVPALGVIHTGTDRSFVYDVADLYKAEVSIPAAFNAVASGTEDVGPMVRRL 255 Query: 251 CRDIFRSSKTLAKLIPLIEDVLAAGEIQPPA 281 RD + + +++ ++ V++ + +P + Sbjct: 256 VRDAVVEQRLMPRMVRDLKFVMSVPDDEPLS 286 >UniRef50_C7MTM5 CRISPR-associated protein, Cas1 family n=6 Tax=Actinomycetales RepID=C7MTM5_SACVD Length = 342 Score = 278 bits (712), Expect = 2e-73, Method: Composition-based stats. 
Identities = 112/301 (37%), Positives = 169/301 (56%), Gaps = 7/301 (2%) Query: 4 LPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 P + L DRVS ++++ +D + A +I++ +P VA ++L PGTRV+H Sbjct: 11 HPHDLHRLTDRVSSVYIERSHLDRAENAIAIINRRET-VRLPAALVAVVLLGPGTRVTHG 69 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 A++L A GT + WVGE GVR+YA+G + L QA L RL+V R M+ +R Sbjct: 70 AMQLLADSGTAVCWVGEQGVRMYAAGLGPSRGAALLQRQAYLVSRTTTRLEVARAMYAMR 129 Query: 124 FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKD-WEKGDTINQCISA 182 F + +++QLRG EG+RVR Y A+Q+GV WNGR Y D + GD +N+ +SA Sbjct: 130 FPGEDVSTLTMQQLRGREGARVRKVYRQQARQHGVPWNGRAYKAGDAFAVGDDLNRLLSA 189 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGE 242 A + LYG+ A I+ G +P +GF+HTG SFV DIAD+ K + +P AF++A R E Sbjct: 190 ANAALYGICHAVIVGLGASPGLGFIHTGSATSFVMDIADLYKAEYTIPLAFQLAARGLLE 249 Query: 243 PDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP----EDAQPVAIPLPVSLG 298 +R+ R A RD + L ++I ++ +LA + P P D + +P V+ Sbjct: 250 -ERDARTALRDRIAGTGLLPRIIKDVKTLLAPEGVDLPDPEVNLLWDERGNPVPGGVNWS 308 Query: 299 D 299 D Sbjct: 309 D 309 >UniRef50_Q3ZZ81 CRISPR-associated protein Cas1 n=4 Tax=Bacteria RepID=Q3ZZ81_DEHSC Length = 309 Score = 277 bits (709), Expect = 3e-73, Method: Composition-based stats. 
Identities = 116/283 (40%), Positives = 174/283 (61%), Gaps = 1/283 (0%) Query: 6 LNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAV 65 +DR S ++L+ G++DV + +P+ + +ML PG+ V+HAA+ Sbjct: 5 HELPRFRDRWSYLYLEMGRLDVEADSLGFHQGD-TVVPVPIDQLGVVMLGPGSTVTHAAI 63 Query: 66 RLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG 125 + +Q L+ W G+ GVR+YA+ G + +L+ QA+L D++ RL+V +M+ RF Sbjct: 64 KSLSQNNCLIAWTGQDGVRLYAASIGGTYSARRLIRQARLVSDDEKRLEVAWRMYRFRFN 123 Query: 126 EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATS 185 E P S+E +RG+EG RVR YA +++YGV W GR YD KDW KGD IN+ +SAA + Sbjct: 124 EVIPPVVSLESIRGMEGIRVRRAYAKASQEYGVEWKGRHYDQKDWSKGDPINRALSAANA 183 Query: 186 CLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDR 245 CLYG+ A IL+AGY+ A+GFVHTGK LSFVYD+AD+ K + +P AF++A NP + +R Sbjct: 184 CLYGICHAGILSAGYSSALGFVHTGKMLSFVYDVADLYKTELTIPVAFKVAAANPTDLER 243 Query: 246 EVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQP 288 +VR+ CR+ F K L +L+ I +VL + +P E Sbjct: 244 QVRIECREAFYEFKLLERLLTDIAEVLGVSDDIGESPDEFEGR 286 >UniRef50_A8M406 CRISPR-associated protein Cas1 n=5 Tax=Actinomycetales RepID=A8M406_SALAI Length = 322 Score = 274 bits (700), Expect = 3e-72, Method: Composition-based stats. 
Identities = 114/293 (38%), Positives = 160/293 (54%), Gaps = 5/293 (1%) Query: 10 PLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAA 69 +DR+S ++L+ I A D+ G HIP ++ +ML PGT ++ A+ L A Sbjct: 16 RAQDRISFVYLERCVIHRDSNAITATDEKG-IVHIPAATLGVLMLGPGTSITQQAMMLIA 74 Query: 70 QVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP 129 G +VW+GE GVR YA G+P S L+ QA D RL+V R M+ +RF Sbjct: 75 DNGATVVWIGEHGVRYYAHGRPLARSSRLLVAQAAAVSHRDRRLRVARAMYRMRFPGEDT 134 Query: 130 ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 +++QLRG EG+RVR Y A++ GV+WN R YDP D+ D +NQ +SAA +CLYG Sbjct: 135 TNLTMQQLRGKEGARVRRCYRENAQRTGVSWNSREYDPDDFTGSDPVNQALSAAHACLYG 194 Query: 190 VTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRL 249 + A ++A G +P +GFVHTG SFVYDIAD+ K D +P AF+IA + + R Sbjct: 195 IVHAVVVAVGASPGLGFVHTGHDRSFVYDIADLYKADVTIPVAFDIAAAESTDIGADTRR 254 Query: 250 ACRDIFRSSKTLAKLIPLIEDVL----AAGEIQPPAPPEDAQPVAIPLPVSLG 298 A RD + L + + I +L AAG I E+A A+ L G Sbjct: 255 AVRDRVHNGALLGRCVQDIRRLLLTDSAAGPINEEEFDEEADNDAVRLWDEGG 307 >UniRef50_C7MTL6 CRISPR-associated protein, Cas1 family n=4 Tax=Actinomycetales RepID=C7MTL6_SACVD Length = 328 Score = 264 bits (676), Expect = 2e-69, Method: Composition-based stats. 
Identities = 96/319 (30%), Positives = 161/319 (50%), Gaps = 26/319 (8%) Query: 4 LPLNPI---------PLKDRVSMIFLQYGQIDVIDGAF--VLIDKTGIRTHI--PVGSVA 50 P + + D +S ++L+ ++ D + G + + PV +++ Sbjct: 7 HPRRLLTAPTVAMLPRVADSLSFLYLENVRVVQDDTGVCAYVEQPDGGTSRVYLPVAAIS 66 Query: 51 CIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDED 110 CI+ GT V+ A+ A+ T ++W G GVR+Y+ ++ L Q + D+ Sbjct: 67 CILFGTGTSVTQPAMATCARHNTTVLWTGSGGVRMYSGSLAPNLTTEWLERQVRAWADDS 126 Query: 111 LRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDW 170 RL V +M+ +RFG PA S+ LRG+EG R++A Y LA ++G+ R YDP +W Sbjct: 127 TRLAVAARMYSMRFGAEVPAGTSLNTLRGLEGQRMKALYRSLADRHGLRGFKRNYDPANW 186 Query: 171 EKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVP 230 + + +NQ +SAA + LYG +A+LA G +PA+GF+H+GK SFVYD+AD+ K +P Sbjct: 187 GEQNPVNQALSAANTALYGAVHSALLALGCSPALGFIHSGKQHSFVYDVADLYKAKHTIP 246 Query: 231 KAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA---AGEIQPPAPP---- 283 AF + + PDREVR+ R F + + +++ ++ +L A + P Sbjct: 247 LAFALHKSAQ--PDREVRIRMRQDFHLYRLMPRIVRDVQRLLDPSIAQDHDETGEPEEVE 304 Query: 284 ----EDAQPVAIPLPVSLG 298 D A+ V+ G Sbjct: 305 LVHLWDPDLGAVEAGVNHG 323 >UniRef50_C2BS02 CRISPR-associated protein n=1 Tax=Mobiluncus curtisii ATCC 43063 RepID=C2BS02_9ACTO Length = 314 Score = 264 bits (675), Expect = 3e-69, Method: Composition-based stats. 
Identities = 90/296 (30%), Positives = 152/296 (51%), Gaps = 6/296 (2%) Query: 8 PIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRL 67 ++DR+S ++++ ++ A + D G HIP V ++L PGT+V++AA+ L Sbjct: 15 LSRMEDRLSFLYVERAILNREGNALTIQDSRG-IAHIPATQVGVVLLGPGTKVTYAAMAL 73 Query: 68 AAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP 127 G VWVGE GVR YA G+P S A+L ++ RL+ R+M+ +RF Sbjct: 74 LGDAGCSAVWVGEKGVRYYAHGRPAAKTSRMAEAHARLWANQRSRLRCARRMYSMRFPGE 133 Query: 128 APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCL 187 + + QLRG EG+R++ YA +++ GV W R YDP D+ GD IN ++ + L Sbjct: 134 DVSNLPLSQLRGREGARMKRIYAEQSRRTGVPWTRRSYDPNDFGAGDPINCALTEGAAAL 193 Query: 188 YGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREV 247 YG+ A ++ G+ P++G +H+G +FVYD+AD+ K + +P AFE + + V Sbjct: 194 YGIAHAVVVGLGFIPSLGIIHSGTDRAFVYDVADLYKAEISIPAAFEAVAASAEGDELNV 253 Query: 248 RLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP----EDAQPVAIPLPVSLGD 299 R RD +++ + +++ ++ V+ D I V+ D Sbjct: 254 RKRIRDKVVTTRLMQRMVRDLQYVMEIPTDDAYIDANLLLWDELE-VIAAGVNWAD 308 >UniRef50_Q03C58 CRISPR-associated protein n=3 Tax=Lactobacillus RepID=Q03C58_LACC3 Length = 315 Score = 261 bits (667), Expect = 2e-68, Method: Composition-based stats. 
Identities = 106/298 (35%), Positives = 169/298 (56%), Gaps = 6/298 (2%) Query: 7 NPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVR 66 +++RV+ ++L++ +I+ D A V+ID G IP ++ +ML PG V+H A+ Sbjct: 12 ELSRVRERVTFLYLEHAKINRQDSAIVVIDTGGT-VAIPAALISVLMLGPGVDVTHRAME 70 Query: 67 LAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE 126 L G +VWVGE GVR YA G+ S L+ QAKL + LR+ V R+M+++RF + Sbjct: 71 LMGDAGMSVVWVGERGVRQYAPGRALTHSSALLVAQAKLVSNNRLRVGVARQMYQMRFPD 130 Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 + S+++LRG EG+RVR Y +++ GV W R YDP++++ G INQ ++AA + Sbjct: 131 DDVSTLSMQELRGKEGARVRRIYREESRRTGVEWTHREYDPENYQSGSIINQALTAAHAA 190 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARR--NPGEPD 244 LYG++ + I+A G +P +GFVHTG LSFVYD AD+ K + +P AF +A + Sbjct: 191 LYGLSYSVIVALGASPGLGFVHTGHDLSFVYDFADLYKAEVTIPIAFTVAANATEQDDIG 250 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP---EDAQPVAIPLPVSLGD 299 + RLA RD F K + +++ ++ +L + A D + VS + Sbjct: 251 QLTRLAVRDAFVDGKLMIRMVADLKMLLGDVDDDLDADVVNLWDDKLGLQKFGVSYRE 308 >UniRef50_Q47PJ6 CRISPR-associated protein, Cas1 family n=4 Tax=Actinomycetales RepID=Q47PJ6_THEFY Length = 332 Score = 244 bits (622), Expect = 4e-63, Method: Composition-based stats. 
Identities = 90/312 (28%), Positives = 156/312 (50%), Gaps = 21/312 (6%) Query: 8 PIPLKDRVSMIFLQYGQIDVIDGAF--VLIDKTGIRTHIPV--GSVACIMLEPGTRVSHA 63 + D +S +++ +I D + +TG +P+ S+AC++L PGT ++ Sbjct: 16 LPRVSDGLSFLYVDVCRIVQTDTGVCAEVETETGRIHRVPIPTASLACVLLGPGTSITSP 75 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 A+ + T +V G G+ Y S + + QA+ D+ R V +M+E+R Sbjct: 76 AMATFMRHNTTVVTCGAGGILNYGSFPAPNRTTKWIDRQARAYSDDRRRRDVAVRMYEMR 135 Query: 124 FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAA 183 FGE P S+E+LR +EG+R++A Y LA + V R Y+P DW+ D +N+ +SA+ Sbjct: 136 FGEEPPPGASIERLRQLEGARMKALYRSLAAKNRVKPFKRNYNPHDWDDQDPVNKALSAS 195 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 + LYGV + + G PA+GF+H+GK +FVYDIAD+ K T +P AF ++R P Sbjct: 196 NAALYGVVHSVLAHLGCHPALGFIHSGKQDAFVYDIADLYKARTTIPLAFSLSRT--ANP 253 Query: 244 DREVRLACRDIFRSSKTLAKLIPLIEDVLA----AGEIQPPAPP-----------EDAQP 288 ++E RL R + + + +++ ++ +L+ + PP D Sbjct: 254 EQEARLRLRRDLKLYRLIPQIVRDVQTLLSLDDPEEAVSEEEPPSSGGPWQVVDLWDPVV 313 Query: 289 VAIPLPVSLGDA 300 A+ V+ + Sbjct: 314 GAVSGGVNYANH 325 >UniRef50_C4X9I5 CRISPR-associated Cas1 family protein n=12 Tax=Bacteria RepID=C4X9I5_KLEPN Length = 294 Score = 242 bits (619), Expect = 9e-63, Method: Composition-based stats. 
Identities = 92/274 (33%), Positives = 147/274 (53%), Gaps = 2/274 (0%) Query: 6 LNPIP-LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 +P +KD+ ++L+ G++++ D + +D G +PV ++ ++L PGT V+H A Sbjct: 16 RELLPQVKDKYPFLYLERGRLEIDDSSVKWVDADGNVVPLPVATINTLLLGPGTTVTHEA 75 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 ++ A + WVGE + YA+G A + L Q LA D LKV R MF RF Sbjct: 76 IKTATAANCAVCWVGEDSLLFYAAGFLPTADTRNLKAQMALACDASSTLKVARAMFAKRF 135 Query: 125 GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + +S+ + G+EGSRVRA Y A++YGV W GR++ P +E D NQ +++ Sbjct: 136 PDADLEGKSLNSMMGMEGSRVRALYQQKAQEYGVGWKGRQFTPGKFELSDLTNQVLTSTN 195 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 + LYG+ + + A GY+P IGF+H+G PL FVYD+AD+ K + AF ++R G D Sbjct: 196 AALYGILCSVVHAMGYSPHIGFIHSGSPLPFVYDLADLYKERLCIDLAFSLSREMAGRYD 255 Query: 245 RE-VRLACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 + V A R + L + I +++ Sbjct: 256 KHKVSEAFRKRVIALDLLNLIAADINELMGGKGA 289 >UniRef50_B6IWM1 CRISPR-associated protein Cas1, putative n=1 Tax=Rhodospirillum centenum SW RepID=B6IWM1_RHOCS Length = 281 Score = 238 bits (608), Expect = 2e-61, Method: Composition-based stats. 
Identities = 86/275 (31%), Positives = 136/275 (49%), Gaps = 6/275 (2%) Query: 4 LPLNPIPLKDRVSMIFLQYGQIDVIDGAFVL-IDKTGIRTHIPVGSVACIMLEPGTRVSH 62 L IP K R +I+++ ++ + +G+ V+ D G +P + ++L PG+ ++H Sbjct: 8 LDAARIPQKSRNGLIYVERCRLSIDNGSLVIAFDDRGEELELPYQRLNAVLLGPGSSITH 67 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 AVR + GT L +VG G R+Y + S QA E R+ V ++M+ Sbjct: 68 DAVRHCSGHGTCLAFVGSDGTRLYTAPPLFDRDSTLARQQATWWAGESTRIMVAKRMYAK 127 Query: 123 RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISA 182 RFGE P S++ LRG+E +R+R +Y L+A Q G+ W GRR+D D + D NQ I+ Sbjct: 128 RFGE-TPRATSLDSLRGMEAARIRHSYELIAAQAGIVWRGRRFDRSDPDGDDLPNQAINH 186 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARR---- 238 + + A+ A G P +GF+H S+ DI D+ + VP AF +R Sbjct: 187 VVTAVEACVAIAVQATGTLPPLGFLHEDSAKSWTLDICDLYRTSVTVPLAFRCVKRIDQG 246 Query: 239 NPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 DR R A R + + +I I++VLA Sbjct: 247 ATDSLDRICRRAVSAHVRDTGFIDTIIDDIKEVLA 281 >UniRef50_C9M9R9 CRISPR-associated protein Cas1 n=1 Tax=Jonquetella anthropi E3_33 E1 RepID=C9M9R9_9BACT Length = 281 Score = 233 bits (595), Expect = 5e-60, Method: Composition-based stats. 
Identities = 97/279 (34%), Positives = 146/279 (52%), Gaps = 10/279 (3%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGI-----RTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 M++L+ G + V DG + G IP +V+ I+LEPGT ++H RL Q Sbjct: 1 MLWLERGNLFVKDGTLRFVSAGGGSLEKGTYDIPYQNVSMIVLEPGTTITHDVFRLMGQQ 60 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPAR 131 GT L+ VG+ GVR Y + G RS Q +L + RL+V M+ +RFGE P R Sbjct: 61 GTGLIAVGDKGVRCYTAPPLGPDRSALARRQVELWANPQTRLQVALAMYAIRFGEELPTR 120 Query: 132 RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVT 191 + +E LRGIEG+R+R +Y++LAK YG+TW RR++ K K D IN ++ A S +YG Sbjct: 121 K-IEDLRGIEGARLRKSYSILAKFYGLTWTLRRFNRKQPNKTDDINAAVNHAASAMYGAA 179 Query: 192 EAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP----DREV 247 + A+ A P +GFVH +F DIAD+ + + +P AF EP +R V Sbjct: 180 DIAVAAVSAIPQLGFVHAKSCRAFALDIADLYRTEITLPAAFRGLASYLEEPGMDLERHV 239 Query: 248 RLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDA 286 R K ++K+I I++++ G+ + Sbjct: 240 RKLIGQELYRQKVISKMIDQIKELILHGQSDEALEEKSP 278 >UniRef50_Q21QB1 CRISPR-associated protein Cas1 n=1 Tax=Rhodoferax ferrireducens T118 RepID=Q21QB1_RHOFD Length = 277 Score = 225 bits (573), Expect = 2e-57, Method: Composition-based stats. 
Identities = 104/267 (38%), Positives = 152/267 (56%), Gaps = 9/267 (3%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 K+R+ +FL+ G + V DG +L+ + IP V+C+M+EPG V+H A++L + Sbjct: 17 HKNRIPYLFLEKGILRV-DGHCLLLCQAESAIEIPGSMVSCLMIEPGVSVTHEAMKLCGE 75 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 GTLL+WVGE G R YA+ + ++L QA + ++ R+ +++ L F + P Sbjct: 76 NGTLLMWVGEGGTRFYAAAHA-HQDASRVLRQAAIHTNQRERIAAASRLYGLMFDDHMPP 134 Query: 131 RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 ++E+LRG+EGSRV+ Y LA + G+ W GR +N I ATSCLY + Sbjct: 135 SFTIEKLRGLEGSRVKEIYVNLADKLGMVWQGREEKS-------ALNTSIGFATSCLYAL 187 Query: 191 TEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 E AILAAGY P IG VH+G P S V+D+AD +KF TVVP AFEIA +P + VR Sbjct: 188 CEVAILAAGYHPGIGVVHSGNPRSLVFDLADTVKFKTVVPLAFEIAATSPSNLNMAVRHG 247 Query: 251 CRDIFRSSKTLAKLIPLIEDVLAAGEI 277 CRD+F L+ +E++ Sbjct: 248 CRDLFSRESMFETLLGHLENIFGTDHD 274 >UniRef50_C8W2P4 CRISPR-associated protein Cas1 n=1 Tax=Desulfotomaculum acetoxidans DSM 771 RepID=C8W2P4_DESAS Length = 545 Score = 224 bits (572), Expect = 2e-57, Method: Composition-based stats. 
Identities = 48/308 (15%), Positives = 90/308 (29%), Gaps = 34/308 (11%) Query: 5 PLNPIPLKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 P P+P + ++++ + + + IP+ ++ ++L +S Sbjct: 203 PARPLPSLNLGRVLYVDEPGAYVRKKGERVQVTRDKEVLVDIPLCNLEQLVLAGTVNISA 262 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 ++L GT + +V AG + S + Q K D +LRLK + Sbjct: 263 QVIKLLLDRGTEVHFVSRAGKYYGSLQPALTKNSALRIAQHKAYQDMELRLKYAVLFVQG 322 Query: 123 RFGEPAP-----------------------------ARRSVEQLRGIEGSRVRATYALLA 153 + S+ L GIEG+ R + + Sbjct: 323 KLANMRTILLRYNRDLKEKQLEEAICRLKSLSKNLYKADSLNSLMGIEGAATREYFRVFN 382 Query: 154 KQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK-- 211 GD +N +S A + L A++ GY P IGF+H Sbjct: 383 YMIKQHVPFNFQQRSRRPPGDPVNALLSFAYTLLTKDMIASVSIVGYDPYIGFLHRSDYG 442 Query: 212 PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDV 271 + D + + + + D + F + K E+ Sbjct: 443 RPALALDFIEEFRPIVADSVVLTVLNKGMINTD-DFEYKMGGCFLNDSGRKKFYRAYEER 501 Query: 272 LAAGEIQP 279 P Sbjct: 502 RHEMISHP 509 >UniRef50_Q0AA34 CRISPR-associated protein Cas1 n=11 Tax=Bacteria RepID=Q0AA34_ALHEH Length = 298 Score = 222 bits (567), Expect = 9e-57, Method: Composition-based stats. 
Identities = 90/279 (32%), Positives = 142/279 (50%), Gaps = 10/279 (3%) Query: 9 IPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIR-----THIPVGSVACIMLEPGTRVSHA 63 IP DR +++L G++ V DG IP ++ I+L PG+ V+H Sbjct: 18 IPHVDRHGLLWLTRGRLYVEDGTLHFTAAESEDLAAGDYAIPYQGLSMILLGPGSTVTHD 77 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 +RL A+ GTLL +G G + Y + G RSD A L ++ RL V R+M+ R Sbjct: 78 VLRLLARHGTLLAAIGGGGTKYYTAPPMGQGRSDVARRHATLWANKTQRLDVARRMYAFR 137 Query: 124 FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAA 183 FG P + + LRGIEG R++ Y + A ++G+ W GRRY+ + D NQ I+ A Sbjct: 138 FGRVLPH-KDIAVLRGIEGGRIKELYRVEASRFGIPWKGRRYNRNNPSAADVPNQAINHA 196 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 + + + A+ A G P +GF+H +F DIAD+ + + VP AF+ AR+ +P Sbjct: 197 ATFVEAAADIAVAATGALPPLGFIHEESSNAFTLDIADLYRGEITVPLAFQAARKVLDDP 256 Query: 244 ----DREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 +R +R F+ K + K+I I+D++ A + Sbjct: 257 TLSIERTLRRDAASAFQRHKVIPKMIDRIKDLINADDNG 295 >UniRef50_C1ZJF3 CRISPR-associated protein, Cas1 family; CRISPR-associated exonuclease, Cas4 family n=1 Tax=Planctomyces limnophilus DSM 3776 RepID=C1ZJF3_PLALI Length = 598 Score = 221 bits (564), Expect = 2e-56, Method: Composition-based stats. 
Identities = 50/307 (16%), Positives = 90/307 (29%), Gaps = 38/307 (12%) Query: 6 LNPIPLKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 +P +D I++Q I L IP+ V+ + L +V+ A Sbjct: 261 RKLVPARDDALPIYVQDQGTYIGKDGERLKLTPAKSSPLFIPLIQVSQVCLMGNVQVTAA 320 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 A+R A + + G + + + Q+K A D L + R + Sbjct: 321 AIRELADRNIPISYFSYGGWFTALTSGMCHKNVELRMAQSKAAFDPQAALSIARGFISAK 380 Query: 124 FG-------------------------EPAPARRSVEQLRGIEGSRVRATYALLAKQYG- 157 + ++ L G+EG + +A ++ Sbjct: 381 IKNSRTLLRRHADDKHRSDLDRLADYIQKVEQVDNLNSLMGLEGMAAKTYFAGFSRLLRG 440 Query: 158 ---VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--P 212 GR P D +N +S S L A G+ P +GF+H + Sbjct: 441 GDEFNLEGRNRRPP----TDPVNALLSFVYSLLTKELTITTQAVGFDPFLGFLHQPRYGR 496 Query: 213 LSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 S D+A+ + + P +R A + +I E + Sbjct: 497 PSLALDLAEEFRPLVGDSTVLTLINNEEVSPKSFIRRAGSVALTET-GRKAVIAAYERRM 555 Query: 273 AAGEIQP 279 P Sbjct: 556 ETEITHP 562 >UniRef50_D0MKV7 CRISPR-associated protein Cas1 n=1 Tax=Rhodothermus marinus DSM 4252 RepID=D0MKV7_RHOM4 Length = 553 Score = 220 bits (562), Expect = 4e-56, Method: Composition-based stats. 
Identities = 38/303 (12%), Positives = 91/303 (30%), Gaps = 30/303 (9%) Query: 6 LNPIPLKDRVSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 +P ++ +++ Q + ++ + H + ++ + + +++ Sbjct: 216 RRLVPSRNDRLPLYVLSQGSVVKRKGAQLLVQTQDDKDQHFRLIDLSRVSIFGNVQITTQ 275 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE-- 121 A+R + + + AG + + + Q ++A DE L + R + Sbjct: 276 AIRALVEHNIPVFFHSYAGRMLARLVSMYDVNAPVRVAQFEVAADETKSLAIARAIVTGK 335 Query: 122 -----------------------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGV 158 R A S+++L GIEG+ R + ++ Sbjct: 336 IKNQRTLLRRNQRTRSERVLRELSRLAREARRASSLDELLGIEGAAARLYFRQFSRMLRH 395 Query: 159 TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFV 216 + D +N +S + L A++A G P G H + S Sbjct: 396 RIAFDFKNRNRRPPKDPVNAMLSFLYALLLKDAMCALMATGLDPYRGIFHQMRFGRPSLA 455 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGE 276 D+ + + + + + + + K+I E + Sbjct: 456 LDLMEEFRPLIADSVVLRLVNTGAVT-EADFIVRGPACAMKKSAMEKVIEAYEQRMNTIL 514 Query: 277 IQP 279 P Sbjct: 515 RHP 517 >UniRef50_UPI00016C522C CRISPR-associated protein Cas1/Cas4 n=1 Tax=Gemmata obscuriglobus UQM 2246 RepID=UPI00016C522C Length = 571 Score = 220 bits (562), Expect = 4e-56, Method: Composition-based stats. 
Identities = 49/314 (15%), Positives = 92/314 (29%), Gaps = 43/314 (13%) Query: 6 LNPIPLKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 IP+ D ++++LQ + V+ + T +P+ ++ +++ +VS Sbjct: 225 KKVIPMTDDGAVLYLQEPGTSVGKRSEHLVVKKEGQELTRVPMHAIRQVVVCGNVQVSTQ 284 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 A+ A + +V G + + Q + D L + + + + Sbjct: 285 ALETLAANDIAVAYVTGHGRFIGSFVPAPAKNVSLREAQFRTFNDPSACLDLAKAVVRAK 344 Query: 124 FGEP-------------------------------APARRSVEQLRGIEGSRVRATYALL 152 A+ SVE + GIEG + Sbjct: 345 LSNQRALLMRSLRGEGEARGSHEYSAKGIYGLLGALDAQTSVESVLGIEGQGAALYFGDF 404 Query: 153 AKQYGVTWNGRRYD---PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 + G+ +D D +N +S A + L + G+ P GF H Sbjct: 405 GRFLKQPPTGKGFDFTTRNRRPPRDPVNALLSFAYAMLAKDCFSVACTVGFDPYKGFFHV 464 Query: 210 GKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDRE--VRLACRDIFRSSKTLAKLI 265 G+ S D+ + + + PD R AC + K Sbjct: 465 GRHGKPSLALDLMEEFRPVIADSVVLTLINNEALTPDDFIIWRDACS---LTEKGRRAFF 521 Query: 266 PLIEDVLAAGEIQP 279 E A P Sbjct: 522 AAYEQRKATVVTHP 535 >UniRef50_Q467D6 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Methanosarcina barkeri str. Fusaro RepID=Q467D6_METBF Length = 550 Score = 218 bits (555), Expect = 3e-55, Method: Composition-based stats. 
Identities = 40/297 (13%), Positives = 89/297 (29%), Gaps = 30/297 (10%) Query: 12 KDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAA 69 KD +++ + V+ +P+ ++ + + +S +R Sbjct: 219 KDDKKPVYVTGWGTSVHKKGDRLVIKKNDEELQSVPLRQISQLSIYGDAHISLPVLRSLI 278 Query: 70 QVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE--- 126 ++ + + G S D ++Q + A D + L + RKM + Sbjct: 279 EMNVPVCYFSFGGWFYGLSHGVMSKNVDLRIHQYQTAFDSERSLAISRKMIAGKIKNCRT 338 Query: 127 ----------------------PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRR 164 A + + QL GIEG+ + ++ + + Sbjct: 339 LLRRNDTEVSEKILSQLNSLEKKASNAKEIGQLLGIEGTAAQIYFSRFGNMLKQDLDCKF 398 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADI 222 + D +N +S L + + G+ P +GF H K + D+ + Sbjct: 399 ENRNKRPPTDPVNAVLSYLYGILTKEVFVTLFSVGFDPYMGFYHQPKYGKPALALDLMEE 458 Query: 223 IKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + A + + + + + T K+I E + P Sbjct: 459 FRPLIADSVALTLFNNKTVTLE-DFEITNFGVSLKDNTKKKIISGYERRINTEITHP 514 >UniRef50_Q74H36 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Geobacter sulfurreducens RepID=Q74H36_GEOSL Length = 559 Score = 217 bits (554), Expect = 3e-55, Method: Composition-based stats. 
Identities = 48/318 (15%), Positives = 91/318 (28%), Gaps = 48/318 (15%) Query: 5 PLNPIPLKDRVSMIFLQ--YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 P IP R +++Q + V+ ++ +G + + L ++ Sbjct: 211 PRPIIPADGRGLPLYVQSPKAYVRKDGDCLVIEEERVRVAEARLGETSQVALFGNATLTT 270 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 AA+ + + W+ G + + G + YQ + + D + L + R+ Sbjct: 271 AALHECLRREIPVTWLSYGGWFMGHTVSTGHRNVETRTYQYQRSFDPETCLNLARRWIVA 330 Query: 123 RFGE-------------------------------PAPARRSVEQLRGIEGSRVRATYAL 151 + A S+E L GIEG+ + Sbjct: 331 KIANCRTLLRRNWRGEGDEAKAPPGLLMSLQDDMRHAMRAPSLEVLLGIEGASAGRYFQH 390 Query: 152 LAKQY--------GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPA 203 ++ G + R P D +N +S A + L A+ A G P Sbjct: 391 FSRMLRGGDGEGMGFDFTTRNRRPPK----DPVNALLSFAYAMLTREWTVALAAVGLDPY 446 Query: 204 IGFVHTGK--PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTL 261 GF H + + D+ + + VR A + Sbjct: 447 RGFYHQPRFGRPALALDMMEPFRPLIADSTVLMAINNGEIRTGDFVRSA-GGCNLTDSAR 505 Query: 262 AKLIPLIEDVLAAGEIQP 279 + I E + P Sbjct: 506 KRFIAGFERRMEQEVTHP 523 >UniRef50_C2KP50 CRISPR-associated Cas1 family protein n=5 Tax=Actinomycetales RepID=C2KP50_9ACTO Length = 312 Score = 216 bits (550), Expect = 9e-55, Method: Composition-based stats. 
Identities = 71/303 (23%), Positives = 128/303 (42%), Gaps = 17/303 (5%) Query: 7 NPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKT----------GIRTHIPVGSVACIMLEP 56 I L+DRVS ++L+Y Q+ I + IPV +A + L P Sbjct: 17 EQIRLEDRVSYLYLEYCQVIQNHTGVAAISEGNHDSEDREPLKRIIQIPVAGLAVLFLGP 76 Query: 57 GTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVV 116 GT ++ A+ ++ G +++ G G Y+ + S + QA L DE K Sbjct: 77 GTSITQPAMASCSRAGLTVIFSGGGGCPYYSHAMALTSSSRWAIAQAHLVADERNARKAA 136 Query: 117 RKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 + +++ + G ++ Q+RG+EGS ++ Y L++++ V R D D + Sbjct: 137 KFLYKRQLGIDIERELTISQMRGLEGSLIKKRYRELSREFKVNGFRR-----DTGGEDVL 191 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIA 236 NQ ++ LYG +A A G PA+G +H G S ++D+AD+ K + +P +F Sbjct: 192 NQALNLVNGILYGCAASACAALGVNPALGIIHRGDIRSLLFDVADLYKPNAALPISFRSV 251 Query: 237 RRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLPVS 296 ++ P + R R L +I ++ +VL + +P Sbjct: 252 SKDE--PLKFARKEMRRFIYEQNVLENMISILMNVLEPYLPTIKDDRLIDESGEVPGHKD 309 Query: 297 LGD 299 + Sbjct: 310 YSN 312 >UniRef50_C7QUZ4 CRISPR-associated protein Cas1 n=8 Tax=Cyanobacteria RepID=C7QUZ4_CYAP0 Length = 325 Score = 215 bits (549), Expect = 1e-54, Method: Composition-based stats. 
Identities = 51/295 (17%), Positives = 101/295 (34%), Gaps = 35/295 (11%) Query: 15 VSMIFLQY--GQIDVIDGAFVL----IDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +S+++L + AF + D + + IP +V I+L ++ A+ A Sbjct: 1 MSILYLTQPDAVLSKKQEAFHVALKQEDGSWKKQLIPAQTVEQIVLIGYPSITGEALCYA 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP- 127 ++G + ++ G + ++ L Q + +E+ RL +V+ + + Sbjct: 61 LELGIPVHYLSCFGKYLGSALPGYSRNGQLRLAQYHVHDNEEQRLALVKTVVTGKIHNQY 120 Query: 128 -------------------APARRSVEQLRGIEGSRVRATYALLAKQYGVTW--NGRRYD 166 ++ ++EQ+RG+EG + + W NGR Sbjct: 121 HVLYRYQQKDNPLKEHKQLVKSKTTLEQVRGVEGLAAKDYFNGFKLILDSQWNFNGRNRR 180 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIK 224 P D +N +S A L AA+ AG P IG++H + V D+ + + Sbjct: 181 PP----TDPVNALLSFAYGLLRVQVTAAVHIAGLDPYIGYLHETTRGQPAMVLDLMEEFR 236 Query: 225 FDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + +P + + S + E L P Sbjct: 237 PLIADSLVLSVISHKEIKPT-DFNESLGAYLLSDSGRKTFLQAFERKLNTEFKHP 290 >UniRef50_C3WS02 CRISPR-associated protein n=2 Tax=Fusobacterium RepID=C3WS02_9FUSO Length = 335 Score = 215 bits (549), Expect = 1e-54, Method: Composition-based stats. 
Identities = 39/291 (13%), Positives = 92/291 (31%), Gaps = 34/291 (11%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 I+ Q + + ++ G IP+ ++ +++ G ++S A + G + Sbjct: 5 YIYEQGIVLRYKENRLLITYTNGDYKSIPIENIDNVVIFGGIQLSTACMHNLLIKGIHVT 64 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR------------- 123 ++ + G D+ Q + + D+ L + +K + + Sbjct: 65 FLSKTGSYFGRLESTSNINIDRQREQFRKSDDKKFCLAIGKKFIKGKATNQRTLLIRANK 124 Query: 124 ----------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDP 167 + +++E+L G+EG R + + ++ + Sbjct: 125 DLKSEILSSVINSMFGIIKDINDSKTIEELMGVEGYLARLYFNAINHIIDKKYSFKT--R 182 Query: 168 KDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKF 225 D N IS + L+ ++ G P F+H+ + + D+ + + Sbjct: 183 TKRPPKDPFNAVISFGYTLLHYEIFTTLVTKGLNPYAAFLHSDRHKHPALCSDLMEEWRA 242 Query: 226 DTVVPKAFEIARRNPGEPDRE-VRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 V A + N + +F + K K + E L Sbjct: 243 ILVDSMAIALLNNNKIAYEDFNFDEKSGGVFLNKKACGKFVEQFEKRLRQE 293 >UniRef50_B9K7F7 CRISPR-associated protein, Cas1 family n=1 Tax=Thermotoga neapolitana DSM 4359 RepID=B9K7F7_THENN Length = 329 Score = 214 bits (545), Expect = 4e-54, Method: Composition-based stats. 
Identities = 51/309 (16%), Positives = 98/309 (31%), Gaps = 35/309 (11%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKT---GIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + G++ + ++ + R IPV ++ I ++ + AA+ Sbjct: 7 NYYVFSSGRVKRHENTILIEYQKAGMQQRKFIPVENIDQIFFLGEVDLNSKFLDFAAKNN 66 Query: 73 TLLVWVGEAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVVRKMFE---LRFGE 126 +L + G Y + L+ Q + LD + RL + RK E F Sbjct: 67 IVLHFFNYYG--YYTGSFYPREKFISGELLVRQVEHYLDSEKRLTLARKFVEGAVHNFKR 124 Query: 127 PAPA------------------RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPK 168 +++ +L E + Y+ + G + R P Sbjct: 125 NIEKRGFDITDKISEYLERTKYAKTIPELMSCEAHARKLYYSTWEEITGWPFEERSMQPP 184 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFD 226 + +N IS S Y + + P I ++H K S DIA+I K Sbjct: 185 ----LNELNALISFGNSLTYSIVLKELYFTHLNPTISYLHEPGTKRFSLALDIAEIFKPI 240 Query: 227 TVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDA 286 V F++ + ++ R +F + + I E++L + + Sbjct: 241 FVDRIIFKLINLKKIDREKHFLQEARGVFLNEEGRRLFIEEFENMLQQTILHRKLKRKIK 300 Query: 287 QPVAIPLPV 295 I L Sbjct: 301 YQSLIRLEA 309 >UniRef50_B7KMR5 CRISPR-associated protein Cas1 n=3 Tax=Chroococcales RepID=B7KMR5_CYAP7 Length = 558 Score = 214 bits (544), Expect = 5e-54, Method: Composition-based stats. 
Identities = 61/318 (19%), Positives = 107/318 (33%), Gaps = 46/318 (14%) Query: 3 WLPLNPIPLKDRVSMIFL-QYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV 60 W P+ P+ D +I + + G ++ + I +G V+ ++L +++ Sbjct: 210 WHPVRLFPVDDEREVIHVLEPGTRVGRTGEQLKISRPNQPDEKIAIGQVSQVVLHSFSQI 269 Query: 61 SHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF 120 S AV A + +V G R S + + Q + D L++ RK+ Sbjct: 270 STQAVHFLAYKEVGIHFVSGGG-RYIGSIDARSRSIQRRVRQYQALSQPDFCLELARKLV 328 Query: 121 ELRFGEP-------------------------------APARRSVEQLRGIEGSRVRATY 149 R P +S++ L GIEG+ + Sbjct: 329 ACRGEGQRKFLMRGKRNKKGDSLALEKTIAQMKAVLKQVPQIQSLDSLLGIEGNLAALYF 388 Query: 150 ALLAKQY------GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPA 203 L+ + ++GR P D N +S S L AILA G PA Sbjct: 389 GALSNLLAENAPESLLFSGRNRRPPK----DRFNALLSFGYSLLIKDVMNAILAVGLEPA 444 Query: 204 IGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTL 261 +GF H + + D+ +I + V R+ + + + + ++ S Sbjct: 445 LGFYHQPRTQAPPLALDLMEIFRVPLVDMPVVTSINRSQWDIQADFDVRGQQVWLSDSGR 504 Query: 262 AKLIPLIEDVLAAGEIQP 279 K I L E A P Sbjct: 505 RKFINLYEQRKAETWKHP 522 >UniRef50_B8GLF6 CRISPR-associated Cas1/Cas4 family protein n=1 Tax=Thioalkalivibrio sp. HL-EbGR7 RepID=B8GLF6_THISH Length = 570 Score = 210 bits (535), Expect = 4e-53, Method: Composition-based stats. 
Identities = 39/315 (12%), Positives = 91/315 (28%), Gaps = 45/315 (14%) Query: 5 PLNPIPLKDRVSMIFLQ--YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 P +D +++Q ++ + + + V+ +++ ++ Sbjct: 225 PRPLAVARDEALPLYIQARGAKLAKRGETLEVTVDDEKVQSVRLIDVSQVIVMGNVYITT 284 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 ++ Q + W G + + G + Q K + +E L + + + E Sbjct: 285 PCLQELMQREIPVSWHSHGGWFMGHTMGTGHKNVEIRTAQYKASFEEHQCLHIAKGLVEA 344 Query: 123 RFGE---------------------------PAPARRSVEQLRGIEGSRVRATYALLAKQ 155 + + ++++L GIEG+ + A+ Sbjct: 345 KIQNCRTLLRRNWKGEDKPVDLLDGLQVDIRKSRRASNLQELLGIEGAAASRYFGAFARL 404 Query: 156 YG---------VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGF 206 + R P D +N +S A + L A++ A G P GF Sbjct: 405 LKHSDAGPELTFDFTTRNRRPP----TDPVNALLSYAYALLTRSWTASLSAVGLDPYRGF 460 Query: 207 VHTGK--PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKL 264 H + + D+ + + + P +A + + Sbjct: 461 YHQPRYGRPALALDMMEPFRPLIADSSVIQAINNGEVRPSDFQSVAGSVALTND-GRKRF 519 Query: 265 IPLIEDVLAAGEIQP 279 I E ++ P Sbjct: 520 IATFERRMSHEITHP 534 >UniRef50_A5ILM3 CRISPR-associated protein, Cas1 family n=3 Tax=Bacteria RepID=A5ILM3_THEP1 Length = 327 Score = 210 bits (535), Expect = 5e-53, Method: Composition-based stats. 
Identities = 46/291 (15%), Positives = 93/291 (31%), Gaps = 32/291 (10%) Query: 16 SMIFLQYGQIDVIDGAFVLI----DKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 + G+I + + ++ D + IPV +V I ++ + AA+ Sbjct: 4 NYYVFSSGRIRRRENSILIEYQDRDGKQQKRFIPVENVDQIFFLGEVDLNSKFLDFAAKN 63 Query: 72 GTLLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE--------- 121 +L + G + + + L+ Q + LD + RL + RK E Sbjct: 64 NIVLHFFNYYGYYTGSFYPREKFLSGELLVRQVEHYLDNEKRLSLARKFVEGAIHNFKRN 123 Query: 122 ------------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKD 169 + E ++ +L E + Y+ + R P Sbjct: 124 IEKRGFDIVSKISEYQERIKHVATIPELMSCEAHARKLYYSTWEDITDWPFEERSMQPP- 182 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDT 227 + +N IS S Y V + P + ++H K S DI++I K Sbjct: 183 ---LNELNALISFGNSLTYSVVLKELYHTHLNPTVSYLHEPGTKRFSLALDISEIFKPIF 239 Query: 228 VVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 V F++ + + +F + + + E++L + Sbjct: 240 VDRIIFKLINLGKIKRENHFLQESNGVFLNDEGRRIFVEEFENMLQQTVLH 290 >UniRef50_UPI0001C16754 protein of unknown function DUF48 n=1 Tax=Cylindrospermopsis raciborskii CS-505 RepID=UPI0001C16754 Length = 334 Score = 208 bits (530), Expect = 2e-52, Method: Composition-based stats. 
Identities = 45/288 (15%), Positives = 93/288 (32%), Gaps = 35/288 (12%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 + Q + ID + T +P+ + +++ ++ A+++ Q + Sbjct: 5 FLLEQDTLVRQIDERLEIFKHDKRLTDVPLCKLESVVVYGRVILTIPALKILNQRSIPIT 64 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE----------LRFGE 126 ++ E G V + Q + + D L + R++ RF Sbjct: 65 YLSEEGRTVATLLPEPNPNAILRSKQYQASFDTHKTLAIAREIIRGKLFNQHTILARFSR 124 Query: 127 PAPAR--------------------RSVEQLRGIEGSRVRATYALLAKQY-GVTWNGRRY 165 S+ +LRG EG + + +L + G W+ Sbjct: 125 QNERSEKVISALKSLKACQKSIDQTTSLNELRGYEGQGASSYFGVLGELLTGTPWSF--S 182 Query: 166 DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADII 223 D +N + + LYG AA+ G P GF+H + + D+ + Sbjct: 183 HRTRRPPTDPVNALLGFGYALLYGDCRAALHTVGLDPYQGFLHGERYGRANLALDLMEEF 242 Query: 224 KFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDV 271 + V ++ R N E + + + + + + K I E Sbjct: 243 RPIFVDGLVLQLLRNNSLEKESFINYPGGAVHLNEQGMKKFIQSYEQR 290 >UniRef50_C9M4E6 CRISPR-associated protein cas1 n=1 Tax=Lactobacillus helveticus DSM 20075 RepID=C9M4E6_LACHE Length = 332 Score = 207 bits (528), Expect = 3e-52, Method: Composition-based stats. 
Identities = 48/293 (16%), Positives = 91/293 (31%), Gaps = 41/293 (13%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 G+++ D L G + + I L + + AQ+ + Sbjct: 7 YYLFSSGELERKDNTVRLTRSDGKYKDLKIEVTRDIYLFGEVSTNTKCLNYLAQMKIPVH 66 Query: 77 WVGEAGVRVYASGQPG---GARSDKLLYQAKLALDEDLRLKVVRKM-------------- 119 + G Y L+ Q + D RL +K Sbjct: 67 FFNYYG--FYTGSFYPKEQNVSGTLLIQQVQAYTDPKRRLYYAKKFVLGAAKNLLRNLKY 124 Query: 120 FELRFGEPAPARR-------------SVEQLRGIEGSRVRATYALLAKQY--GVTWNGRR 164 ++ R + + VE+L G+EG+ R YA + V ++ R Sbjct: 125 YQRRGRNLDDSIKEITSLIRQIDQVHDVEELMGVEGTIHRRYYASWQSVFIPDVDFSKRV 184 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADI 222 P D + +N IS +Y + I P I ++H+ + S DI++I Sbjct: 185 RRPPD----NMVNTLISFLNGLMYTTCLSEIYVTQLNPTISYLHSPMDRRFSLCLDISEI 240 Query: 223 IKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 K V F + +N + + + S K+ ++ + + Sbjct: 241 FKPMIVDRLIFSLINKNMIS-EEDFNKESNYCYLSEKSKRIIVSEYDKYMKQK 292 >UniRef50_C4G3M4 Putative uncharacterized protein n=1 Tax=Abiotrophia defectiva ATCC 49176 RepID=C4G3M4_ABIDE Length = 340 Score = 207 bits (528), Expect = 3e-52, Method: Composition-based stats. 
Identities = 51/306 (16%), Positives = 105/306 (34%), Gaps = 39/306 (12%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S +++ Q +I I G F+L K G +P + I + + ++ A++ + Sbjct: 7 MSCLYVVEQGSKIKHIGGQFILEVKDGENRVVPDEILESISIFGNSVLTTQAIKACLEKN 66 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP----- 127 + ++ G D+L QA L+ + D LK + + + + Sbjct: 67 INVSFLSTKGRYFGKLMSNTATNPDRLKAQAYLSDNIDECLKFAKIILKAKINNQDVILR 126 Query: 128 -----------------------APARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNG 162 + + ++ G EG R + L+K ++G Sbjct: 127 RYAKSSEADISSHIKDLKIYEEHIEKGKDINKIMGYEGIAARTYFEALSKLIKPEFKFSG 186 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIA 220 R P D N +S S +Y + I +P IGF+H + + V D+ Sbjct: 187 RNKRPPK----DAFNSMLSLGYSLIYNEIFSEIENRNLSPYIGFIHKLKDRHPALVSDLI 242 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDI-FRSSKTLAKLIPLIEDVLAAGEIQP 279 + + V + + N + + + F S + +++ IE+ L + Sbjct: 243 EEWRAVLVDATMMSLIQGNEILIEEFTKDEYSEAVFISDLAVKQIVRKIENKLRSQNNYL 302 Query: 280 PAPPED 285 E Sbjct: 303 EYLNEP 308 >UniRef50_C1DUM1 Crispr-associated protein Cas1 n=18 Tax=Bacteria RepID=C1DUM1_SULAA Length = 331 Score = 207 bits (526), Expect = 5e-52, Method: Composition-based stats. 
Identities = 46/295 (15%), Positives = 88/295 (29%), Gaps = 36/295 (12%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGI---RTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + G+I + + +T +P+ + I + ++ A+ +Q Sbjct: 4 NYYINSNGRIRRKENTVYFETEKDGESLKTPLPINDIDTIFIFGEVDINTKAINYLSQYD 63 Query: 73 TLLVWVGEAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVVRKMFE-------- 121 + + G Y+ L+ Q K +D+ R + E Sbjct: 64 IPMHFFNYYG--YYSGSFLPRKKNVSGSLLVEQVKHHIDDSKRQMLAISFIEGAVHHILR 121 Query: 122 ---------LRFGEPAP-------ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRY 165 F +++E+L IEG+ Y L + Sbjct: 122 NLRKSGISVENFQNIEKDLLPKIFETKTIEELMAIEGNIREHYYQLFNTVIKNK-DFFIE 180 Query: 166 DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADII 223 + + IN IS S +Y I P I ++H+ K S DIA+I Sbjct: 181 KREKRPPTNPINALISFGNSIMYNTVLTEIYRTQLDPTISYLHSPQEKRFSLSLDIAEIF 240 Query: 224 KFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 K + P F + + D + + + K + E+ L+ Sbjct: 241 KPFIIDPLIFNLIKTGQITID-DFDKDLNYTYLNENGRKKFLKAYEERLSKTVKH 294 >UniRef50_B9YDC3 Putative uncharacterized protein n=1 Tax=Holdemania filiformis DSM 12042 RepID=B9YDC3_9FIRM Length = 343 Score = 207 bits (526), Expect = 6e-52, Method: Composition-based stats. 
Identities = 38/311 (12%), Positives = 95/311 (30%), Gaps = 46/311 (14%) Query: 11 LKDRVSMIFLQ--YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 ++ + +++ + + V++++ + IP+ +++ I++ +S A + Sbjct: 1 MRKIGNTLYITLPDVYLSLDGENVVILNQQKVIKRIPLHNLSSIVMFNYQGISPALMGKC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM--------- 119 L ++ AG + Q ++ + L + + M Sbjct: 61 MSQNITLSFLSPAGYFLGRVVGEYQGNVLLRKKQILVSENNQQSLLIAKNMILAKVYNSK 120 Query: 120 ---------FELRFG---------------EPAPARRSVEQLRGIEGSRVRATYALLAKQ 155 + LR +S ++LRG+EG+ + ++ Sbjct: 121 WILERAIRDYPLRIDIEKMRVIFKKLSEILNSIEKVQSHDELRGLEGTAAKLYFSSFDDL 180 Query: 156 Y-----GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG 210 + R P + +N +S + L +A+ + G IGF H Sbjct: 181 ILRQKEDFVFTTRTRRPP----LNKVNALLSFVYTLLSHDCASALESVGLDCYIGFFHVD 236 Query: 211 KPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 +P S D+ + + + R + ++ + K+I Sbjct: 237 RPGRMSLALDLMEEFRPCLADRFVLSLINRKEIDSCDFFDYENGAVYLNDDGRKKVIQAW 296 Query: 269 EDVLAAGEIQP 279 + P Sbjct: 297 QLKKNEELTHP 307 >UniRef50_B7IHY4 Cas crispr-associated protein Cas1 n=4 Tax=Bacteria RepID=B7IHY4_THEAB Length = 330 Score = 206 bits (525), Expect = 7e-52, Method: Composition-based stats. 
Identities = 49/291 (16%), Positives = 94/291 (32%), Gaps = 39/291 (13%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 + G++ D G + +P+ ++ + + ++ + A++ G L+ Sbjct: 5 IYIFNSGELKRKDNTICFESSEGKKY-LPIENINNLWIFGEVELNKRFLDFASENGILIH 63 Query: 77 WVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKM--------------FE 121 + G + +L QA+ LD + R+K+ RK ++ Sbjct: 64 FFNFYGYYTGTFYPREHLNSGFVILKQAEHYLDNEKRIKLARKFVEGAVENLLVVLKYYK 123 Query: 122 LR---FGEPAP----------ARRSVEQLRGIEGSRVRATYALLAKQYG---VTWNGRRY 165 R + + + VE L EG+ Y K + R Sbjct: 124 NRGYELEDEIDDIKNKKEGIYSSQDVETLMSFEGNIRDIYYKCFDKITKKEEFAFEKRSR 183 Query: 166 DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADII 223 P + +N IS S LY I P IG++H+ + + D+A+I Sbjct: 184 RPP----LNKMNSLISFGNSLLYTTVLGEIYQTQLDPRIGYLHSTNNRRFTLNLDVAEIF 239 Query: 224 KFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 K V F + +N ++ I ++ K I L Sbjct: 240 KPSIVDRVIFSLVNKNVLSS-KDFDKQLNGIVLNNSGKKKFIEEYNKKLET 289 >UniRef50_D2NTT1 Uncharacterized protein predicted to be involved in DNA repair n=2 Tax=Rothia mucilaginosa RepID=D2NTT1_9MICC Length = 560 Score = 206 bits (525), Expect = 7e-52, Method: Composition-based stats. 
Identities = 50/295 (16%), Positives = 88/295 (29%), Gaps = 40/295 (13%) Query: 18 IFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVW 77 + Q + V G ++ + +P+ V +++ VS A +R ++W Sbjct: 237 LTTQGSRATVQQGRLIVQHLGETISSVPLERVHSLVVHGNIDVSSALLRELMWRNCTIIW 296 Query: 78 VGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA--------- 128 G RVY QPG + Q + L L + M + A Sbjct: 297 CSSTG-RVYGWSQPGTGPNGLARVQ-QHVLSAQGYLPIASAMIASKIANQATLLRRNGHA 354 Query: 129 --------------PARRSVEQLRGIEGSRVRATYALLA--------KQYGVTWNGRRYD 166 P S+ +L G+EG + A + G W GR+ Sbjct: 355 ADVCRTMRDIQKNTPQATSIPELLGLEGEAASLYFGNFATMLKEDALTELGWIWTGRQGR 414 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIK 224 D IN ++ A L IL G P GF+H+ + D+ + + Sbjct: 415 ----GANDPINILLNYAYGMLSSECIRGILTCGLDPHAGFLHSSSRNKPALALDLMEEFR 470 Query: 225 FDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + R + + + + +I E + P Sbjct: 471 AVIADSVVVSLINRKQVKAAHFTTVGGSYR-LTPEGRKTIIAAFEKRMVTEITHP 524 >UniRef50_Q1CWU5 CRISPR-associated protein Cas1 n=1 Tax=Myxococcus xanthus DK 1622 RepID=Q1CWU5_MYXXD Length = 342 Score = 204 bits (520), Expect = 3e-51, Method: Composition-based stats. 
Identities = 43/293 (14%), Positives = 88/293 (30%), Gaps = 43/293 (14%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 I + +++ V+ + + +P+ + ++ ++ + + G + Sbjct: 9 FITAEGTRLNKEGECVVVTVQDQKKAEVPLRHLRSVVCLTRAWLTPELMESCLEAGIHVS 68 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF---------------- 120 + G G + G Q + A D + + R + Sbjct: 69 FFGMTGRFLARVEGVPGGNVLLRRQQYRAADDIARSVAISRALVVGKLGNARQFVLHARR 128 Query: 121 -------------ELRFGEPAPARRSVEQL---RGIEGSRVRATYALLAKQY-----GVT 159 R E A VE L RG+EG R + G Sbjct: 129 DAAPERQEALSETARRLSEHLRALTRVEDLVQVRGLEGIAARDYFESFPALLKKSAQGFE 188 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVY 217 ++GR P + +N +S + L A+ G PA+GF+H +P S Sbjct: 189 FDGRNRRPPR----NPLNAMLSFGYALLAQDCAGALTGVGLDPAVGFLHEDRPGRLSLAL 244 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 D+ + + V F + R +P + + + ++ Sbjct: 245 DLMEEFRAPVVDRLVFSLVNRGQLKPGDFRTESAGAVLLKDDARKTFLVAYQE 297 >UniRef50_C0BZ41 Putative uncharacterized protein n=3 Tax=Clostridiales RepID=C0BZ41_9CLOT Length = 343 Score = 204 bits (518), Expect = 4e-51, Method: Composition-based stats. 
Identities = 44/311 (14%), Positives = 100/311 (32%), Gaps = 46/311 (14%) Query: 11 LKDRVSMIFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K ++ +++ + + V+++ +P+ ++ I+ T S A + Sbjct: 1 MKKLLNTLYVTSTNRYLFLDGENVVILEDQEEIGRVPLHNLEGIVTFGYTGASPALMGAC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM--------- 119 A+ L ++ G + Q +++ + +K+ R Sbjct: 61 AERNIDLSFMSGNGRFLARVSGEVRGNVTLRKEQYRISEQKKESIKIARNFITGKVYNAK 120 Query: 120 ---------FELRFGEPAPARRS---------------VEQLRGIEGSRVRATYALLAKQ 155 + LR +S E+L GIEG +++ + Sbjct: 121 WVLERAARDYPLRLDVDRIKEKSAFMSGNLPKIRECEDAERLLGIEGESASLYFSVFDEL 180 Query: 156 Y-----GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG 210 ++ GR P D +N +S A + L G+ +A+ + G P +GF HT Sbjct: 181 ILQQKDEFSFGGRNKRPP----LDNVNAMLSFAYTLLTGMCASALESVGLDPYVGFYHTD 236 Query: 211 KPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 +P S D+ + ++ + + + + T + + Sbjct: 237 RPGRVSLALDLMEELRSVMADRFVLTLINKKIIGASGFSKKESGAVIMDDDTRRQFLSHW 296 Query: 269 EDVLAAGEIQP 279 +D P Sbjct: 297 QDKKKETITHP 307 >UniRef50_B0K547 CRISPR-associated protein Cas1 n=12 Tax=Bacteria RepID=B0K547_THEPX Length = 330 Score = 203 bits (517), Expect = 6e-51, Method: Composition-based stats. 
Identities = 53/298 (17%), Positives = 97/298 (32%), Gaps = 39/298 (13%) Query: 14 RVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT 73 + ++ G++ D + G + IPV + + IM+ ++ + Q Sbjct: 2 KKTIYIFSDGELKRKDNTLFFEGENGRKF-IPVENTSEIMVFGEVSLNKRLLEFLTQSEI 60 Query: 74 LLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE----------- 121 +L + G V + + +L QA+ D RL + +K E Sbjct: 61 ILHFFNHYGYYVGSYYPREHLNSGYMILRQAEHYNDGSKRLYLAQKFVEGAYKNIRQVLK 120 Query: 122 ----------------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY---GVTWNG 162 + GE + ++ +L IEG+ Y + ++ Sbjct: 121 YYSNRGKDLEDVIYSIEKLGESVDSTSTINELMAIEGNIREYYYKAFDEIIQNPDFKFDF 180 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIA 220 R P + +N IS S +Y T + I P IGF+H + S D++ Sbjct: 181 RSKRPPQ----NFLNTLISFGNSLMYTTTLSEIYKTHLDPRIGFLHATNFRRFSLNLDVS 236 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 +I K V F + + + A + K + ED LA Sbjct: 237 EIFKPIIVDRTIFTLLSKKMVTKEDFEEDAE-GLLLKEKGKKVFVQEFEDKLATTIKH 293 >UniRef50_O57912 Putative uncharacterized protein PH0173 n=1 Tax=Pyrococcus horikoshii RepID=O57912_PYRHO Length = 317 Score = 202 bits (515), Expect = 9e-51, Method: Composition-based stats. 
Identities = 55/302 (18%), Positives = 110/302 (36%), Gaps = 29/302 (9%) Query: 16 SMIFL-QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 S I++ Q G ++ +++ ++ +P+ +++ I ++ A++L + Sbjct: 3 SPIYITQPGILERKANTLFFVNE-EMKRALPINTISEIHCFAPVTLTSGAIKLLSDNDVP 61 Query: 75 LVWVGEAGVRVYASGQPGGA-RSDKLLYQAKLALDEDLRLKVVRKMFE------------ 121 + + + G + ++ QA +D + RL + R++ E Sbjct: 62 VHFYNKYGYYRGSYLPAESQISGSIVVAQASHYIDNEKRLYIAREILEGTRASMISLLKS 121 Query: 122 -----LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGV-TWNGRRYDPKDWEKGDT 175 + S+E+L GIE + YA + ++ R P D Sbjct: 122 QRAEYKDLADIDLKGESIEELMGIESQLWKTFYAHFSHLLKFFNFDERNRRPPR----DE 177 Query: 176 INQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAF 233 IN IS S LY VT + I P I ++H + S D+A+I K V Sbjct: 178 INAMISYGNSVLYTVTLSEIRKTYLHPGISYLHEPRERRYSLALDLAEIFKPIVVFRVIL 237 Query: 234 EIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPL 293 + + + VR + +S+ + + I D L+ + P + + I L Sbjct: 238 RLVNKRIIREEHFVRD--VGVLLNSEGVKIFLGGINDELSRKVLHPTLRRKVSVRYLIRL 295 Query: 294 PV 295 Sbjct: 296 EA 297 >UniRef50_A1A2M8 CRISPR-associated DNA polymerase n=14 Tax=Bacteria RepID=A1A2M8_BIFAA Length = 343 Score = 202 bits (515), Expect = 1e-50, Method: Composition-based stats. 
Identities = 44/311 (14%), Positives = 93/311 (29%), Gaps = 46/311 (14%) Query: 11 LKDRVSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K ++ +F+ + + + + V+ +P+ S+ IM S A + Sbjct: 1 MKQLLNTLFVMTEDAYLALENDNVVIYQNDQTLAKVPLRSIEGIMCFSYKGASPALMGRC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE-- 126 ++G + + G + Q ++A DE L+ + + Sbjct: 61 GKLGVSMAFYSPRGHYYCSVLGEENRNVLLRREQFRVADDEQKSLRYAKSFIVGKLYNAK 120 Query: 127 -------------------------------PAPARRSVEQLRGIEGSRVRATYALLAKQ 155 A ++++LRG+EG + + Sbjct: 121 WVLERTKRDHALRVNIDRLAEQSGKLSAALSKARKSLTIDELRGVEGLAAKDYFYAFDDL 180 Query: 156 Y-----GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG 210 + R P D +N +S S L AA+ G P +GF+HT Sbjct: 181 VLKNKDDFFFTSRSRRPP----LDRLNALLSFCYSILTNDCIAALQGVGLDPYVGFMHTD 236 Query: 211 KPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 +P S D+ + + + +P + S K++ Sbjct: 237 RPGRASLALDLVEEFRPVLADRFVLTLVNTGAVKPGDFEIRENGGVLLSDFGRKKVLTAW 296 Query: 269 EDVLAAGEIQP 279 + + + P Sbjct: 297 QKKKSDQILHP 307 >UniRef50_A7HNI6 CRISPR-associated protein Cas1 n=1 Tax=Fervidobacterium nodosum Rt17-B1 RepID=A7HNI6_FERNB Length = 334 Score = 202 bits (514), Expect = 1e-50, Method: Composition-based stats. 
Identities = 44/296 (14%), Positives = 97/296 (32%), Gaps = 39/296 (13%) Query: 15 VSMIFLQYGQ-IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT 73 +++ L+ G + DG ++ + IP+ + I L ++ + Sbjct: 1 MTIYVLEQGTVLAKKDGRMIITKAKQVLDEIPLKKIERINLLGNITLTSQMINYCLDNKI 60 Query: 74 LLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG-------- 125 ++++ + G L Q + A D+ +L++ + + + + Sbjct: 61 EVIFMTQHGRYRGKLYTDEYRNVLLRLKQYERATDKQFQLEISKSIVQGKLQNYYNFLTQ 120 Query: 126 ---------------------EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVT---WN 161 E ++V+++RG EG + ++ K +N Sbjct: 121 KSKNLPKGLLSEERAGIRTVIEKVNKAKTVDEVRGYEGIGSKIYFSGFKKCIRTEELTFN 180 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDI 219 GR P D IN +S LY AI A G P G +HT S ++D+ Sbjct: 181 GRTAHPPK----DEINAMLSLGYYFLYVEMLLAINAVGLDPYFGNLHTIDVSKQSLLFDL 236 Query: 220 ADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 + + + + + + + + + + K I E ++ Sbjct: 237 VEEFRCVIIDNFVLNLINLKTIKKEDFEKRENDIYYFTKDGMKKYITEYEQMMKQK 292 >UniRef50_C9LM09 CRISPR-associated protein Cas1 n=1 Tax=Dialister invisus DSM 15470 RepID=C9LM09_9FIRM Length = 331 Score = 201 bits (511), Expect = 3e-50, Method: Composition-based stats. 
Identities = 42/300 (14%), Positives = 88/300 (29%), Gaps = 35/300 (11%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S I++ +++ G +V+ + +P V + L ++S + + + Sbjct: 1 MSWIYVTEPGAKLNRQGGRYVISRENETICEVPSAVVEGVTLFDSIQISSSVIVDFLERN 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM------------- 119 L W+ G + Q D+D L + +++ Sbjct: 61 IPLTWISSTGRFFGRLESTDHQNVLRQKEQFDALADKDFCLALAKRVVFGKVYNQRTILR 120 Query: 120 -FELR---------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGR 163 + R + SVE++ G EG R + + + R Sbjct: 121 NYNRRAEDPFIEKVRSDIRILADKLHMAHSVEEVMGYEGMMARIYFQAIGHILPEEF--R 178 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIAD 221 D N +S + L +AI+ G P IGF+H + + D+ + Sbjct: 179 FEKRTKRPPRDYFNSLLSFGYTLLMYDFYSAIVNCGLHPYIGFLHALRNGHPALASDLME 238 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPA 281 + V + D V+ I+ + + E + Sbjct: 239 PWRPAVVDAFCLSLVTHREISKDYFVKGENGGIYLNRIGRRIFLQAYERKMRTVNRYFQG 298 >UniRef50_Q2SIC8 CRISPR-associated protein Cas1 n=6 Tax=Gammaproteobacteria RepID=Q2SIC8_HAHCH Length = 338 Score = 200 bits (510), Expect = 4e-50, Method: Composition-based stats. 
Identities = 41/306 (13%), Positives = 87/306 (28%), Gaps = 41/306 (13%) Query: 11 LKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K + +++ + V+ P +VA I VS + Sbjct: 1 MKKLQNSLYVTRQESYLHKERETIVIKQGKDKLAQFPAHAVANIFCFGQISVSPFLMGYC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA 128 + G LV+ E G + Q +++ + RL + R + + Sbjct: 61 GEQGIGLVFFTEYGKFLARIQGRQSGNVLLRREQYRVSDKDTPRLDIARSIVLAKIANSR 120 Query: 129 ------------------------------PARRSVEQLRGIEGSRVRATYALLAKQY-- 156 +++ LRG+EG + + + Sbjct: 121 RVLQREVRNKGAHPSLDEAIARLANCLRRCERPDNLDILRGLEGEAAAIYFGVFNQLLKQ 180 Query: 157 -GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL-- 213 G + GR P D +N +S + + +A+ G P +G++H +P Sbjct: 181 DGFDFKGRVRRPP----TDPVNALLSFLYTLVAQEISSALQGVGLDPYVGYLHVDRPGRV 236 Query: 214 SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 DI + + + R + A + K++ ++ Sbjct: 237 GLALDILEEFRAWWCDRLVLTLINRKEVKASDFDIEASGAVRLKEDARRKVLVAYQEKKQ 296 Query: 274 AGEIQP 279 P Sbjct: 297 EEISHP 302 >UniRef50_B1X158 DUF48-containing protein n=12 Tax=Cyanobacteria RepID=B1X158_CYAA5 Length = 330 Score = 200 bits (509), Expect = 5e-50, Method: Composition-based stats. 
Identities = 39/319 (12%), Positives = 90/319 (28%), Gaps = 42/319 (13%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + +++ Q + + + K + I + + +++ ++++ A+R Sbjct: 1 MKTLYVSQQGCYVKLDQETLKVEKKRQLLAEIQLPLIEQVLIFGKSQMTTQAIRACLWRN 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG------- 125 +V++ G YQ +L D RL R++ + + Sbjct: 61 IPIVYLSRMGYCYGRIMSLKRGYRHLTRYQQQL--DFSQRLLTARELVKAKLKNCRVILQ 118 Query: 126 ----------------------EPAPARRSVEQLRGIEGSRVRATYALLAKQY---GVTW 160 E A ++E+L G EG ++ + Sbjct: 119 RQQRRLQSDKLLFAIDSLNYLIEQANLAETIERLMGFEGVGASTYFSAFGDCLTPSEFIF 178 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYD 218 R P G+ +N +S L+ A I G P +H G + D Sbjct: 179 LARSRRPP----GNPVNAMLSFGYQVLWNHLLALIELQGLDPYQACLHQGSERHAALASD 234 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 + + + V + R D + + ++ K + + Sbjct: 235 LIEEFRAPMVDSLVLYLVNRRIMNADDDFEYHDGGCYLNNTGRKKYLKHFVQRMEETVQT 294 Query: 279 PPAPPEDAQPVAIPLPVSL 297 P + + + Sbjct: 295 TPDEKQPRWDLLMQQVKRY 313 >UniRef50_A1BI39 CRISPR-associated protein Cas1 n=5 Tax=Chlorobiaceae RepID=A1BI39_CHLPD Length = 731 Score = 199 bits (506), Expect = 1e-49, Method: Composition-based stats. 
Identities = 42/300 (14%), Positives = 97/300 (32%), Gaps = 43/300 (14%) Query: 15 VSMIFL-QYGQIDVIDG-AFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 ++ ++L + G + DG F + + + V + +++ ++ ++ Q Sbjct: 389 LNTLYLQEQGSLMRKDGERFSIEKDGSVINEVIVRRIEQVVIFGNVALTTPVMQYCLQNE 448 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE------ 126 + ++ + G ++ + ++DE L+ R + + Sbjct: 449 IPVTFLSQHGKYFGRLEATTADNAEMQRFHFLRSIDEPFALETARSIVAAKISNSKTMIR 508 Query: 127 -----------------------------PAPARRSVEQLRGIEGSRVRATYALLAKQY- 156 A A ++ LRGIEG + + Sbjct: 509 RRKTVVQDRDSTLQNKMAYNLDIMADLALKAEASTDIDALRGIEGKASALYFECYGMLFS 568 Query: 157 -GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL-- 213 + ++ R + D +N +S + L+ + + A+G P IGF+H + Sbjct: 569 KNLPFHTRSFLRVRRPPTDPVNSLLSFGYTMLHTNIFSMVQASGLNPYIGFLHAERKGNP 628 Query: 214 SFVYDIADIIKFDTVVPKAFEIARRNPG-EPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 + V D+ + + V R E D R F S+ + + + E + Sbjct: 629 ALVNDLVEEFRT-IVDSLVLYTLNRGLLQEKDFYYRKDEPGCFLSNDARKRFLNIFETRM 687 >UniRef50_C2GEC7 Crispr-associated protein Cas1 n=2 Tax=Corynebacterium RepID=C2GEC7_9CORY Length = 533 Score = 198 bits (504), Expect = 2e-49, Method: Composition-based stats. 
Identities = 50/295 (16%), Positives = 88/295 (29%), Gaps = 40/295 (13%) Query: 18 IFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVW 77 + +Q + + +G ++ P+ V +++ +S A R +VW Sbjct: 210 LTVQGSRASIRNGRVIVEKAGERLADAPLERVQGVVIHGNVDISSALHRNFLWHNVPVVW 269 Query: 78 VGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA--------- 128 G RVY P + Q + L L + R M + A Sbjct: 270 CSTTG-RVYGYSCPTDGPNAAARVQ-QHVLSSRGCLPIARGMVNAKIMNQATLLRRNGDS 327 Query: 129 --------------PARRSVEQLRGIEGSRVRATYALLA--------KQYGVTWNGRRYD 166 + QL GIEG ++ + Q G TW GR Sbjct: 328 DRTVQLLIDAGTCSLEAKDNRQLFGIEGDAAALYFSTFSTMLNKAQIDQLGWTWTGRHGR 387 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIK 224 D IN ++ + L A+L+ G P GF+H+ + D+ + + Sbjct: 388 ----GASDPINILLNYSYGLLRAEVIRALLSCGLDPHAGFLHSSGRNKPALALDLMEEFR 443 Query: 225 FDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + R + + R + K+I E + P Sbjct: 444 APVSDSVVISLINRREIKDTDFTHIHGVARLRDT-GRKKIIRAFERRIQTSIKHP 497 >UniRef50_Q2RY11 CRISPR-associated protein, Cas1 family / CRISPR-associated exonuclease, Cas4 family n=2 Tax=Rhodospirillum rubrum ATCC 11170 RepID=Q2RY11_RHORT Length = 580 Score = 198 bits (504), Expect = 2e-49, Method: Composition-based stats. 
Identities = 46/295 (15%), Positives = 84/295 (28%), Gaps = 43/295 (14%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 +I D V+ + + + ++ ++L ++ A+ + + W+ Sbjct: 255 ARIGKKDYTLVIQVEGEADRSLALDEISEVVLAGPVSLTTPAIHELLRREIPVAWMSSGF 314 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE--------------------- 121 + ++G G + Q LA DE R R + Sbjct: 315 WFLGSTGGQGPRSAAVRTAQYALAGDERRRQAFARDLVSAKIRNGRTLLRRNWRGAEAER 374 Query: 122 -------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQY--------GVTWNGRRYD 166 R E A + L GIEG + + + + R Sbjct: 375 QIALDRLARLAERATTAETTACLLGIEGEAAAVYFRAFPQLFTQAVTTLPAFAFERRNRR 434 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIK 224 P D +N C+S + L +A+ AG P GF HT +P + D+ + + Sbjct: 435 PP----ADPVNACLSLCYAVLTRTLSSALSIAGLDPWKGFYHTERPGRPALALDLIESFR 490 Query: 225 FDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + + + A LI E L P Sbjct: 491 PVLADSTVLMVLNNGEIGTN-DFLYAGGGCALKPNARRGLIAAYERRLDQETTHP 544 >UniRef50_C0QR16 Crispr-associated protein Cas1 n=23 Tax=Bacteria RepID=C0QR16_PERMH Length = 331 Score = 198 bits (504), Expect = 2e-49, Method: Composition-based stats. 
Identities = 42/300 (14%), Positives = 85/300 (28%), Gaps = 38/300 (12%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGI---RTHIPVGSVACIMLEPGTRVSHAAVRL 67 + R G++ + + + +P+ + I + ++ A+ Sbjct: 1 MSRR--FYINSNGRLRRKENTLYFETEQNGEIVKKALPINDIDVIYVFGELDINTKALNY 58 Query: 68 AAQVGTLLVWVGEAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVVRKMFE--- 121 + + + G Y+ L+ Q + +D R + E Sbjct: 59 LSGYDIPIHFYNYYG--FYSGSFLPRKKNVSGSLLVEQVRHYIDNKKRQYLAVSFVEGAA 116 Query: 122 ----LRFGEPAP-----------------ARRSVEQLRGIEGSRVRATYALLAKQYGVTW 160 + R+ E+L +EG+ Y L K Sbjct: 117 YHILRNLRKQNINSEDFEEIEKDLFPKIFETRNTEELMALEGNIRERYYQLFNKIINNE- 175 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYD 218 + + + IN IS S +Y I P I ++HT K S D Sbjct: 176 DFFMEKREKRPPDNPINALISFGNSIMYNTVLTEIYRTQLDPTISYLHTPQEKRFSLSLD 235 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 +A+I K + P F + ++ + + K + E+ L+ Sbjct: 236 LAEIFKPFIIDPLIFSLINNRQITI-KDFDKDLNYAYLNENGRKKFLKAYEERLSKTIKH 294 >UniRef50_D0LSW9 CRISPR-associated protein Cas1 n=1 Tax=Haliangium ochraceum DSM 14365 RepID=D0LSW9_HALO1 Length = 594 Score = 198 bits (503), Expect = 2e-49, Method: Composition-based stats. 
Identities = 45/307 (14%), Positives = 85/307 (27%), Gaps = 38/307 (12%) Query: 6 LNPIPLKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 + L+D + +Q + V+ K + + V+ I L ++ Sbjct: 257 RPLVSLRDDALPLHVQEHGAVVSKRAAELVIKRKGSELERVRIKDVSRINLHGSAHITLP 316 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 A++ A G + G + Q A DE L++ +++ + Sbjct: 317 ALQTALGNGIPVGLFTYGGWYYGRAQGHDHKNVLLRQAQFASAQDEGRCLRIAQRLVHAK 376 Query: 124 FGEPAP-------------------------ARRSVEQLRGIEGSRVRATYALLA----K 154 S L GIEGS R + + + Sbjct: 377 IKNSRVMLRRNSRALDRRILDDLSGHARRARQADSQATLLGIEGSAARLYFQNFSGMLRQ 436 Query: 155 QYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--P 212 +++ R P D +N +S + + L A + G+ P GF H + Sbjct: 437 DVPFSFDSRNRRPPR----DPVNALLSFSYALLTAEWTATLSTVGFDPYQGFYHQPRYGR 492 Query: 213 LSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 S D+ + + + D V A + + E L Sbjct: 493 PSLALDLMEEFRPLIADSVVIGAINNGVLDEDDFVVTATAAA-LKPAGRKRFLQAFERRL 551 Query: 273 AAGEIQP 279 P Sbjct: 552 DEQVTHP 558 >UniRef50_A7HMV0 CRISPR-associated protein Cas1 n=3 Tax=Thermotogaceae RepID=A7HMV0_FERNB Length = 329 Score = 198 bits (503), Expect = 3e-49, Method: Composition-based stats. 
Identities = 50/303 (16%), Positives = 102/303 (33%), Gaps = 41/303 (13%) Query: 15 VSMIFL-QYGQIDVIDGAFVLI--DKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 + +++ + + A + + ++P+ +++ IM+ ++ + L + Sbjct: 1 MEALYVFKDSYLRKKSSALYIEPKSEDQKPIYVPLKNISSIMVFSEIEMNKKTLELFSYS 60 Query: 72 GTLLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKM----------- 119 + + G + + LL Q + D + R+ + R++ Sbjct: 61 QVPVFFFNYNGEYIGCFYPVEENKTGEMLLLQLQHYQDLEKRIIIAREILYGVADNILKI 120 Query: 120 ---FELRFGEPAPARRSVE-------------QLRGIEGSRVRATYALLAKQY---GVTW 160 ++ F E +E L IEG+ + Y L + T+ Sbjct: 121 LKSYKNDFPEINEKIEMIEKLKKTYHRQEEIASLMAIEGNIRKNYYEALGVIFSKKDFTF 180 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYD 218 R P D IN IS + LY V + I PAI F+H S D Sbjct: 181 QERTARPP----ADEINAMISFGNTILYNVVLSEIFKTSLEPAISFLHEPNKRKFSLQLD 236 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 IA+I K V + R+ + D + + ++ + K + +E+ L+ Sbjct: 237 IAEIFKPIIVDRTILTLVNRSMIKKD-DFKKVEGGVYLNETGKKKFVKALEEKLSETIDY 295 Query: 279 PPA 281 Sbjct: 296 ENG 298 >UniRef50_A1HM55 CRISPR-associated protein Cas1 n=2 Tax=Thermosinus carboxydivorans Nor1 RepID=A1HM55_9FIRM Length = 332 Score = 197 bits (500), Expect = 6e-49, Method: Composition-based stats. 
Identities = 48/294 (16%), Positives = 93/294 (31%), Gaps = 40/294 (13%) Query: 15 VSMIFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + +++ I G F++ I IP+ + ++L +VS + + G Sbjct: 1 MRTLYVTDAGSHIQKNAGRFLVCKGDTILREIPLELLDNVVLFGSIQVSAKTITEFLKRG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG------- 125 L W+ + G Q ++ D LK+ + + + + Sbjct: 61 ITLTWLSKTGEFYGRLESTRHIDIFLHRQQIRMGDRPDFCLKIAQAIIDAKIANCMTILR 120 Query: 126 ----------------------EPAPARRSVEQLRGIEGSRVRATYALLAKQYG--VTWN 161 E P +E L G+EGS R + LA + Sbjct: 121 RYQRTANSPEVADHIHAMGIIAEKIPNVDKIETLLGLEGSAARHYFTALACLVPDDFAFK 180 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDI 219 GR P D N +S + L + AG P GF+H + + V D+ Sbjct: 181 GRNKQPPK----DPFNSLLSFGYTLLMYDFYTIVQNAGLHPYAGFLHKDRQGHPTLVSDL 236 Query: 220 ADIIKFDTVVPKAFEIARRNPGEPDREV-RLACRDIFRSSKTLAKLIPLIEDVL 272 + + + + R +P + ++ + + A+ I E + Sbjct: 237 MEEWRPSIIDSLVMSLIHRREIQPLDFLPPDKNGGVYLNREASAEFIAAYEKRM 290 >UniRef50_A5D0Y0 Uncharacterized protein n=40 Tax=cellular organisms RepID=A5D0Y0_PELTS Length = 344 Score = 196 bits (499), Expect = 8e-49, Method: Composition-based stats. 
Identities = 50/314 (15%), Positives = 93/314 (29%), Gaps = 39/314 (12%) Query: 14 RVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT 73 + ++ G++ D + G R IPV IM+ V+ + + Sbjct: 16 KKTLYIFSNGELSRKDNTLYFETEEGRRF-IPVEDTGEIMIFGEVDVNKKLLEFLSVKEI 74 Query: 74 LLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE----------- 121 L + G + + + L QA+ LDE+ RL + +++ Sbjct: 75 TLHFFNYHGYYMGSFYPREHLNSGYMTLKQAEHYLDEEKRLVIAKEIVRGAAKNIRQVLK 134 Query: 122 ----------------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG---VTWNG 162 P R L +EG+ Y + + Sbjct: 135 YYYGREKDVGGKLNAIENLMAPIEECRDTSSLMALEGNIRDHYYRAFDEIVDNPDFAFQE 194 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIA 220 R P + +N IS S +Y + + I P IG++H + + DIA Sbjct: 195 RSRRPPK----NYLNTLISFGNSLIYTICLSEIYKTHLDPRIGYLHATNFRRFTLNLDIA 250 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 +I K V F + + + + I K + +++ L Sbjct: 251 EIFKPIIVDRLIFTLLGKKMITKE-DFDRGTEGIMMKEKARKCFVENLDEKLKTTINHRE 309 Query: 281 APPEDAQPVAIPLP 294 + I L Sbjct: 310 IGRPVSYRRLIRLE 323 >UniRef50_A9AYP8 CRISPR-associated protein Cas1 n=1 Tax=Herpetosiphon aurantiacus ATCC 23779 RepID=A9AYP8_HERA2 Length = 333 Score = 195 bits (497), Expect = 1e-48, Method: Composition-based stats. 
Identities = 54/309 (17%), Positives = 104/309 (33%), Gaps = 43/309 (13%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVA-CIMLEPGTRVSHAAVRLAAQV 71 + ++L Q ++ D +++ + IPV V I++ G +VSHAA+ AQ Sbjct: 1 MPTLYLNEQGTRLGKKDERLIILRGQELINDIPVIKVDRVIVMGQGVQVSHAAIVFLAQR 60 Query: 72 GTLLVWVGE-AGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP- 129 G L++ + G + G + L Q ++ + L + +V+ + + Sbjct: 61 GIPLIFTTQSGGSQKAMVSAGLGNNAALRLAQCRIVDNPHLAVPLVQAIVVGKVANQIQL 120 Query: 130 ---------------------------ARRSVEQLRGIEGSRVRATYALLAKQYGVTWN- 161 +EQLRG+EG+ A + + + W Sbjct: 121 LERYGSDWGGMGLRAKQTMQHVIQQTQHMPDIEQLRGLEGAGAAAYWGTWSAVFKTAWGF 180 Query: 162 -GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVY 217 GR Y P D +N +S + L A+ A + P +G HT G+P S Sbjct: 181 AGRAYRPTP----DPLNALLSFGYTLLLNDLMTAVQALSFDPYLGVFHTVQFGRP-SLAL 235 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 D+ + + V ++ + + + + I E + Sbjct: 236 DLEEEFRPCIVDRMVLDVLDAGLLQM-SNFSRTEKGFLLNDRARKSFIQAYEQRMQTPIR 294 Query: 278 QPPAPPEDA 286 + Sbjct: 295 YQGTGNNEP 303 >UniRef50_C1XWQ6 CRISPR-associated protein, Cas1 family n=3 Tax=Thermaceae RepID=C1XWQ6_9DEIN Length = 326 Score = 195 bits (497), Expect = 1e-48, Method: Composition-based stats. 
Identities = 45/292 (15%), Positives = 87/292 (29%), Gaps = 34/292 (11%) Query: 17 MIFLQYGQIDVIDGAFVLID----KTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + L+ G + G + P+ V I++ V+ A ++ A+ Sbjct: 4 VYVLEDGYLAKDGGTLKVSKRGPGGGETLLEKPLIGVEEIVVLGNAVVTPALLKHCAEEN 63 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE------ 126 L +V G + + + Q LD +L + R+ + Sbjct: 64 IGLHYVSTGGRYFAGLTRTPAKNAPARVAQFAAHLDPTRKLALARRFVLGKLRNSLTLLR 123 Query: 127 ---------------PAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKD 169 LRG+EG+ + A G ++ R P Sbjct: 124 RNGAEGWEEIRWAIGELDKAADEGALRGLEGNAADVYFRSYAALLPEGFRFSERSRRPPR 183 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKFDT 227 D N +S A + L E+A+ AG P +G++H + S D+ + + Sbjct: 184 ----DPANSLLSLAYTFLAKECESALQVAGLDPYVGYLHEVRYGRASLALDLMEEFRSIL 239 Query: 228 VVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + + + A ++ K + E+ L P Sbjct: 240 ADSVVLSLLNNRRLTLE-DFDDAEGYPKLRKESFPKFLRAWEERLTDRVRHP 290 >UniRef50_A5UXM3 CRISPR-associated protein, Cas1 family n=4 Tax=Chloroflexaceae RepID=A5UXM3_ROSS1 Length = 332 Score = 195 bits (496), Expect = 1e-48, Method: Composition-based stats. 
Identities = 47/302 (15%), Positives = 94/302 (31%), Gaps = 39/302 (12%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQV 71 + +++Q + D ++ + +P+ + ++L G ++S A + + Sbjct: 1 MPTLYIQEQGVMVRKRDNQVLITKDGQTLSEVPLAKIDQVVLMGRGVQLSTALLIDLLER 60 Query: 72 GTLLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA-- 128 G + + + G R YA+ D + Q + D L++ + + + Sbjct: 61 GIPVTFTNQHGSRHYATLTAGPSRFGDLRIRQMQFVGAPDRALRLAKDIVSAKLTNQRRL 120 Query: 129 ------PARRS-----------------VEQLRGIEGSRVRATYALLAKQYGVTWN--GR 163 PA + V+ LRG EG+ A + W GR Sbjct: 121 LAATGWPAAATAIAQIDAALTAAANAPHVDMLRGHEGAAAAAYFGAWRASLPPVWGFGGR 180 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIAD 221 + P D IN +S + A+ G +G H S D+ + Sbjct: 181 AFYPPP----DPINAMLSFGYTLALHDVITAVQITGLDTYLGVFHVIEPGRPSLALDLLE 236 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACR--DIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + V ++ R N + R ++ L+ E +L P Sbjct: 237 EFRPLIVDRLVIDLVRTNAIGREHFHHPQERPDAVYLDDVGRTLLVQRYESMLQTKVRLP 296 Query: 280 PA 281 Sbjct: 297 GG 298 >UniRef50_C9M4Y8 CRISPR-associated protein Cas1 n=1 Tax=Jonquetella anthropi E3_33 E1 RepID=C9M4Y8_9BACT Length = 342 Score = 195 bits (496), Expect = 1e-48, Method: Composition-based stats. 
Identities = 48/310 (15%), Positives = 89/310 (28%), Gaps = 45/310 (14%) Query: 11 LKDRVSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K + ++L GQI A V+ K PV + I+ ++ A+ L Sbjct: 1 MKQLKNTLYLSYDEGQISCSGRALVIRAKDQAPQQFPVHILEQIVCFGSVMLTPDAMNLC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM--------- 119 + ++ G P Q + A D ++ Sbjct: 61 LANNVTINYLSVYGRFRGRISGPVRGNVLLRRMQFRRADDPVQTAELATAFLLSKISNAR 120 Query: 120 -FELRFGEPAPAR---------------------RSVEQLRGIEGSRVRATYALLAKQY- 156 LR + +LRG+EG + Sbjct: 121 TVLLRHARERETNVFDEAVRDMAGLLVKLKGFTITDLNELRGLEGDAANIYFRCFDSMIL 180 Query: 157 ----GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKP 212 +++GR P D IN +S S L + + G P++GF+H +P Sbjct: 181 KNRETFSFHGRSRRPP----SDPINALLSLGYSLLAAEITGVLESVGLDPSVGFLHKDRP 236 Query: 213 L--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAK-LIPLIE 269 S D+ + + V + R + + + + K + + E Sbjct: 237 GRPSLALDLMEEFRAVVVDRFVLALVNREQLQANDFETDEAGGVALTEKARKEIFLNEWE 296 Query: 270 DVLAAGEIQP 279 + I P Sbjct: 297 NRKRTEIIHP 306 >UniRef50_D1BQ37 CRISPR-associated protein Cas1 n=1 Tax=Veillonella parvula DSM 2008 RepID=D1BQ37_VEIPT Length = 331 Score = 195 bits (495), Expect = 2e-48, Method: Composition-based stats. 
Identities = 60/292 (20%), Positives = 106/292 (36%), Gaps = 37/292 (12%) Query: 17 MIFL-QYGQ-IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 +++ G I G VL + IP+ +V+ ++L ++S + + G+L Sbjct: 4 TLYVMTPGISIRQDGGLLVLEKDHSVIKEIPIATVSNLVLGRTIQISTQVMFSLVKQGSL 63 Query: 75 LVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA------ 128 + +V V G +LL+Q +++ L + + + Sbjct: 64 IQFVDHKYQLVGTLG-DEHTPLQRLLWQVACFQNQEFALDGAKYIVRRKIKGQIALLNQY 122 Query: 129 -----------------------PARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRY 165 + VE LRGIEG R +++L W Sbjct: 123 KKSKCIPNFVVVHRTMQALLKRVERTKKVETLRGIEGLASRTYFSVLGHVLSEPWEFSGR 182 Query: 166 DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADII 223 D +N +S S L A +L AG IG +H+ + S VYD+ DI Sbjct: 183 KR--HPSPDPVNAILSYGYSFLEREVRACLLTAGLDVRIGVLHSTNNRKDSLVYDVMDIF 240 Query: 224 KFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 + D + ++ R+ P+ + L+ R F S + K + L ED + A Sbjct: 241 RQDIIDRFVLKLLNRHMILPE-DFDLSERGCFLSKEANKKWVELYEDYMKAE 291 >UniRef50_C0A724 CRISPR-associated Cas1/Cas4 family protein n=1 Tax=Opitutaceae bacterium TAV2 RepID=C0A724_9BACT Length = 691 Score = 195 bits (495), Expect = 2e-48, Method: Composition-based stats. 
Identities = 45/353 (12%), Positives = 93/353 (26%), Gaps = 84/353 (23%) Query: 6 LNPIPLKDRVSMIFLQ--YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 + +D +++ + + +K + I + ++ + + +S A Sbjct: 308 RRLVAARDDERELYITTPGTTLLKKSELIQIREKGELVNEIRIKDLSHVAIFGSATISTA 367 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 + A+ + + AG + P + Q + A L++ R + + Sbjct: 368 LLNELAERDIAVSYFSSAGTLRAYTRGPSLKNVFTRIAQFRAADTPATALRIARLFVQGK 427 Query: 124 FGEPAP-------------------------ARRSVEQLRGIEGSRVRATYALLA----- 153 S+E+L GIEGS A + + Sbjct: 428 IRNQRTLIMRNHAMPPASTLGRLQHAITAAANTESIEELLGIEGSAALAYFQEFSGMIKT 487 Query: 154 ---------------------------------------------KQYGVTWNGRRYDPK 168 + + + R P Sbjct: 488 TGDDILDAIAEGREPEPGSLSDTTQAPDITTGKKRRSGKRDTPGQEFFSFDFTRRNRRPP 547 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKFD 226 D +N +S A S L +A A G+ P +GF H + + D+ + + Sbjct: 548 R----DAVNALLSLAYSILAKDCTSAAHAVGFDPYVGFYHQPRFGRPALALDLMEEFRPL 603 Query: 227 TVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + P VR A + S + E + + P Sbjct: 604 VADSVVLTLINTRMISPTDFVR-AGDAVNLSPAGRKQFFNAYEQRMRSMITHP 655 >UniRef50_Q3B3C1 CRISPR-associated protein, Cas1 family n=20 Tax=Bacteria RepID=Q3B3C1_PELLD Length = 343 Score = 195 bits (495), Expect = 2e-48, Method: Composition-based stats. 
Identities = 45/311 (14%), Positives = 89/311 (28%), Gaps = 46/311 (14%) Query: 11 LKDRVSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K ++ +F+ Q + V+ + +P+ + I+ S + Sbjct: 1 MKKLLNTLFVTTQGAYLSKEGECAVIKIDKVEKVRLPLHMLDGIICFGQITCSPFLMGHC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA 128 A G + ++ E G + P Q + A + + R + G Sbjct: 61 AANGVTVTFLTEYGKFLCQVQGPTKGNILLRRAQYRQADNYLQSAMLARSFVIGKIGNSR 120 Query: 129 P---------------------------------ARRSVEQLRGIEGSRVRATYALLAKQ 155 E++RGIEG R + + + Sbjct: 121 VTLARALRDHPDKIDCEKMHYAQQLLAGCIKKLGNETDQERIRGIEGEAARIYFEVFDQC 180 Query: 156 YG-----VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG 210 +NGR P D +N +S + + +A+ + G PA GF+H Sbjct: 181 ITSSDPLFCFNGRNRRPP----VDRVNCLLSFLYTLVTHDIRSALESCGLDPAAGFLHKD 236 Query: 211 KPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 +P S D+ + + A + R + A + L+ Sbjct: 237 RPGRPSLALDMLEEFRSYIADRMALSLINRGQIHANDFTVSATGAVLMKDDARKTLLTAY 296 Query: 269 EDVLAAGEIQP 279 + P Sbjct: 297 QKRKQDEIEHP 307 >UniRef50_C5BP90 CRISPR-associated protein Cas1 n=3 Tax=Gammaproteobacteria RepID=C5BP90_TERTT Length = 339 Score = 195 bits (495), Expect = 2e-48, Method: Composition-based stats. 
Identities = 53/309 (17%), Positives = 96/309 (31%), Gaps = 46/309 (14%) Query: 11 LKDRVSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K + +F+ + V+ ++ IP+ ++ I VS + Sbjct: 1 MKKLQNSLFITRQKSYVHKQRETIVVEQESEKILQIPIHAIKSIFCFGNVIVSPFLLGFC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL------ 122 A+ G L + E G + + G + LL + + + L V R + Sbjct: 61 AENGVGLAFFTEYGR--FLARIQGPQSGNVLLRRIQYEKTKSAPLDVARAIIAAKIVSSR 118 Query: 123 ---------------------RFGEPAPARR---SVEQLRGIEGSRVRATYALLAKQ--- 155 R R S+++LRG EG +++ Sbjct: 119 SVLQRHIRNYGSQDDVVKVIGRLKHNLEQARVDPSLDRLRGTEGVAAANYFSVFQHLVRV 178 Query: 156 ---YGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKP 212 G +NGR P D IN +S S L AA+ G P +GF+HT + Sbjct: 179 ENDAGFAFNGRNKRPP----TDPINAMLSFLYSVLGNDISAALQGVGLDPQVGFLHTDRS 234 Query: 213 L--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 S D+ + ++ V R + + + S L+ ++ Sbjct: 235 GRDSLAMDLLEELRAWWVDRLVLTQVNRREIKARDFSQAVSGAVTMSESARKALLKAYQE 294 Query: 271 VLAAGEIQP 279 P Sbjct: 295 KKHEEVEHP 303 >UniRef50_A4J500 CRISPR-associated protein Cas1 n=1 Tax=Desulfotomaculum reducens MI-1 RepID=A4J500_DESRM Length = 544 Score = 195 bits (495), Expect = 2e-48, Method: Composition-based stats. 
Identities = 43/311 (13%), Positives = 92/311 (29%), Gaps = 42/311 (13%) Query: 6 LNPIPLKDRVSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 + P P + ++++ Q + ++ IP+ ++ ++L +S Sbjct: 203 VRPQPGINLGRVLYVDEQGASLYKKGERVLVTKDQIKFKDIPLCNLDQVVLVGNVNLSSQ 262 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR 123 ++L GT + ++ G S + Q + ++ RL + + Sbjct: 263 LIKLFLGRGTEVHFISTKGKYYGCLQAALSKNSVLRIAQHRAYQKQEERLLYASEFVRGK 322 Query: 124 FGEP-----------------------------APARRSVEQLRGIEGSRVRATYALLA- 153 + + +L G+EG+ R +++ Sbjct: 323 LSNMRTNLLRYNRSLNNHSIDEAVSRIKNIIKRLEKAKDLNELMGLEGAGSRDYFSVFGL 382 Query: 154 ---KQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG 210 + +N R P D N +S + S L A+ G+ P IGF+H Sbjct: 383 LIKDRVPFDFNKRSRRPP----EDPANALLSFSYSLLLKDVITAVQVVGFDPFIGFLHRS 438 Query: 211 K--PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 + DI + + + + + F S K+ L Sbjct: 439 DFGRPALALDIIEEFRPVVADSVVLTALNKGVIA-EGDFEYRMGGCFLSETGRKKMYRLY 497 Query: 269 EDVLAAGEIQP 279 E+ P Sbjct: 498 EERRKEMITHP 508 >UniRef50_A2SRR7 CRISPR-associated protein, Cas1 family n=24 Tax=cellular organisms RepID=A2SRR7_METLZ Length = 344 Score = 194 bits (494), Expect = 3e-48, Method: Composition-based stats. 
Identities = 36/311 (11%), Positives = 90/311 (28%), Gaps = 47/311 (15%) Query: 11 LKDRVSMIFLQ--YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 ++ +++++ + + V+ + +P+ ++ ++ S + L Sbjct: 1 MRKLRNVLYVTNPKSYLSRDGESIVVSVENQELARVPIRNLEGVVCFGYMGASPGMMALC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR----------K 118 + + +V G + G Q ++ DE+ K+ + Sbjct: 61 TENDVGMCFVSPYGKFMARIGGNVSGNVLLRKRQYAVSDDEEASKKIAAYCILGKLMNCR 120 Query: 119 MFELRFGEPAPARRSVE------------------------QLRGIEGSRVRATYALLAK 154 RF P S E +LRG EG + + Sbjct: 121 TVLQRFSRDYPDMVSREFEQNFKRLSEGILQIRAGTCGSLNELRGFEGILSKYYFHSFND 180 Query: 155 QY-----GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 ++ R P D +N +S + + + +A+ + G P +GF+H Sbjct: 181 LILSTEPEFSFENRSRRPP----LDRVNALLSFSYTLIAADCASALESVGLDPQVGFLHR 236 Query: 210 GKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPL 267 +P S D+ + + + + D + + ++ Sbjct: 237 VRPGRPSLALDLMEEFRPYLGDRFVLSLINNRVVKADDFAVKENGAVLLTDDARKTVLQA 296 Query: 268 IEDVLAAGEIQ 278 + + Sbjct: 297 WQKRKKEEVMH 307 >UniRef50_C7G6C1 CRISPR-associated protein Cas1 n=3 Tax=Firmicutes RepID=C7G6C1_9FIRM Length = 334 Score = 194 bits (494), Expect = 3e-48, Method: Composition-based stats. 
Identities = 46/319 (14%), Positives = 101/319 (31%), Gaps = 40/319 (12%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S +++ Q I + F + K G+ IP ++ I + +++ + + G Sbjct: 1 MSYLYVSEQGASIGIEANRFQVNYKDGMIKSIPAETLEMIEVFGSVQITTRCLTECLKRG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP--- 129 +++ +G + QA++ +E +L++ +++ + + Sbjct: 61 VNILFYSTSGAYYGRLISTSHVNVQRQRIQAEIGHNETFKLEMSKRIIDAKIRNQVVVLR 120 Query: 130 -------------------------ARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNG 162 +SVEQ+ G EG+ + + +L K + G Sbjct: 121 RYARGRDEDIHRMIIEMQNMQKKLLYAKSVEQVMGYEGTAAKIYFKVLGKLIDEQFVFEG 180 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIA 220 R P D N IS S + I G P G +H + + D+ Sbjct: 181 RSRRPP----MDPFNSLISLGYSIILNELYGKIEGKGLNPYFGVMHKDREKHPTLASDLM 236 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDRE-VRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + + + A + + + + +F K I +E Sbjct: 237 EEWRAVLIDTTALSMLNGHELVKEDFYTGIDQPGVFLEKDGFRKYIQKLEGKFRTENRYL 296 Query: 280 PAPPED-AQPVAIPLPVSL 297 + A+ L V+ Sbjct: 297 SYIDYSVSFRRAMDLQVNQ 315 >UniRef50_C4FMU2 Putative uncharacterized protein n=1 Tax=Veillonella dispar ATCC 17748 RepID=C4FMU2_9FIRM Length = 331 Score = 193 bits (491), Expect = 6e-48, Method: Composition-based stats. 
Identities = 41/298 (13%), Positives = 91/298 (30%), Gaps = 41/298 (13%) Query: 15 VSMIFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S +++ I G V+ + +P+ + I + ++ + V + G Sbjct: 1 MSSLYVTEAGSFIKRDGGHVVVGRNNEVLFEVPLERIEDITVFDSVSITSSLVTDFIERG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE----------L 122 + W+ G +K Q L D R+ + RK+ Sbjct: 61 VPITWLSGYGKYFGTLINTNTIDINKHKKQFDLLDDNAFRVAMSRKIIRAKVRNQLTILR 120 Query: 123 RFGEPAPARRSVE--------------------QLRGIEGSRVRATYALLAKQYG--VTW 160 R+ +++ +L G EG R + L K + Sbjct: 121 RYARNLEEDINIDAQIANIKSVRSHIGECMRVSELMGYEGLISRLYFEALGKIVPSAFAF 180 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYD 218 R P D N + S L+ A ++ AG P +G +H+ + D Sbjct: 181 TKRTKQPPR----DPFNAMLGLGYSMLFNEILAGVINAGLHPFVGIMHSLAKGHPALASD 236 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGE 276 + + + + + RN + E + + + +++ + + + Sbjct: 237 LIEEWRAPIIDSMVLSMVSRNMVDL-AEFDNSDKGCYLTAEGRKVFLTAYNKKIRSEN 293 >UniRef50_Q3J7J6 CRISPR-associated protein, Cas1 family n=2 Tax=Nitrosococcus oceani RepID=Q3J7J6_NITOC Length = 295 Score = 193 bits (491), Expect = 6e-48, Method: Composition-based stats. 
Identities = 71/295 (24%), Positives = 114/295 (38%), Gaps = 35/295 (11%) Query: 8 PIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTG---IRTHIPVGSVACIMLEPGTRVSHAA 64 PI R + +L++ ++ D V + G IP + I+L GT ++ AA Sbjct: 2 PILPSHRQGLYYLEHCRVMAKDERVVYACQEGAFTKFFAIPPANTNVILLGSGTSLTQAA 61 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR- 123 RL A ++ +VG G ++ + Q ++ +L D D RLKV + R Sbjct: 62 ARLLASEQVMVAFVGGGGSPLFLASQNEYRPTEYCQAWMRLWQDNDQRLKVAKTFQRNRA 121 Query: 124 ---------FGEPAPARRSVEQL-----------------RGIEGSRVRATYALLAKQYG 157 EP P + S+E+L E + Y A Sbjct: 122 EFLMQQWPKLAEPKPHKASLEKLAERYLADIELAGDNGTILAQEAKFAKKLYKFWANCTE 181 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH-TGKPLSFV 216 DP + D N + +YG+ A + G ++ +H T + + V Sbjct: 182 T--ENFTRDPGKRDFNDPFNSYLDHGNYLVYGIAAAVLWVLGIPHSLPVIHGTTRRGALV 239 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDV 271 +D+ADIIK V+P AF+ A G D+E+R AC S + L I+ V Sbjct: 240 FDVADIIKDTCVMPIAFQHAA--AGRSDQEMRQACIAWLDESHAMTFLFQSIKRV 292 >UniRef50_B2RM83 CRISPR-associated protein Cas1 n=3 Tax=Bacteroidetes RepID=B2RM83_PORG3 Length = 338 Score = 192 bits (489), Expect = 9e-48, Method: Composition-based stats. 
Identities = 47/307 (15%), Positives = 87/307 (28%), Gaps = 49/307 (15%) Query: 14 RVSMIFLQYGQIDVIDGAFVL---------IDKTGIRTHIPVGSVACIMLEPGTRVSHAA 64 + + G++ D ++ G +IPV ++ + R + + Sbjct: 2 KKTYYLFNPGELSRKDNTIRFVPIQEGENGQEQAGQARYIPVEGISDFYVFGSLRANSSL 61 Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGG---ARSDKLLYQAKLALDEDLRLKVVRKM-- 119 + + Y LL QA ++ RL + RK Sbjct: 62 YNFLGSNDIAVHFFDYY--ENYTGSFMPRDFLLSGKMLLAQASAYKNKKKRLFLARKFIE 119 Query: 120 ------------FELRFGEP-------------APARRSVEQLRGIEGSRVRATYALLAK 154 + R + ++E L GIEG+ +A Y Sbjct: 120 GAASNMQKNLAYYNNRGKDMQPMMELIDKYSLRLEETTTIEALMGIEGNIRQAYYDAFNL 179 Query: 155 QY-GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--K 211 R P E +N IS Y + AI + P I F+HT + Sbjct: 180 IIDPFEMGARSKQPPQNE----VNALISFGNMMCYTLCLKAIHQSQLNPTISFLHTPGER 235 Query: 212 PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDV 271 S DI+++ K V F++ + + + + + E+ Sbjct: 236 RYSLCLDISEVFKPILVDRTIFKVMNKRIIQA-KHFDKQLNKCILNPSGKKLFVQAFEER 294 Query: 272 LAAGEIQ 278 L+ Sbjct: 295 LSETIRH 301 >UniRef50_Q1Q3I6 Putative uncharacterized protein n=1 Tax=Candidatus Kuenenia stuttgartiensis RepID=Q1Q3I6_9BACT Length = 328 Score = 192 bits (488), Expect = 1e-47, Method: Composition-based stats. 
Identities = 34/294 (11%), Positives = 85/294 (28%), Gaps = 38/294 (12%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGE 80 Q + ++ + + + +++ + + AV + G + + Sbjct: 4 QNSILRKSGDRLIIEKDDKVLLEVQCHKIDAVLIFGNVQFTTQAVHELFEHGIEMAILTR 63 Query: 81 AGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG--------------- 125 G + P + Q K ++D RL + + + Sbjct: 64 TGKLIGQITSPYTKNITLRVQQFKQYWNDDFRLAFAKVIVCGKIQNCIQLVRSFSYNHPR 123 Query: 126 --------------EPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWNGRRYDPKD 169 + ++ QL GIEG+ R + K + GR+ P Sbjct: 124 NSFDVEMDDLSLRLNEVESAANISQLFGIEGNAARVYFTSFGKMILSAFAFPGRKKYP-- 181 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKFDT 227 D +N +S + ++ + + G+ P +G+ H S D+ + + Sbjct: 182 --STDPVNALLSLNYTMIFNEISSLLDGLGFDPYLGYYHGIDYGRSSLASDLMEEFRAPI 239 Query: 228 VVPKAFEIARRNPGEPDRE-VRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 + + + ++ + L + E ++ + P Sbjct: 240 ADRITLNLINNRIFCEEDFYANPSTGGVYLKREPLKRYFVEYETMINREFLHPQ 293 >UniRef50_UPI000197AF65 hypothetical protein BACCOPRO_02409 n=1 Tax=Bacteroides coprophilus DSM 18228 RepID=UPI000197AF65 Length = 340 Score = 192 bits (488), Expect = 1e-47, Method: Composition-based stats. 
Identities = 48/308 (15%), Positives = 93/308 (30%), Gaps = 43/308 (13%) Query: 11 LKDRVSMIFLQ--YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 ++ ++ +++ I V+ IP ++ I+ S ++L Sbjct: 1 MRKLLNTLYVTTPNAYISKDGLNIVVSVNQEEVFRIPAINIESIVTFGYMGASPGVMKLC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE-- 126 + G L ++ G + Q +L DE L V M + Sbjct: 61 SDSGISLTFLSPHGKFISRVQSATKGNVLLRKKQYQLVDDEAWSLHVSLLMIGGKIQNYR 120 Query: 127 ---------------PAPARRSVE-------------QLRGIEGSRVRATYALLA----- 153 A +++E L G EG A + +L Sbjct: 121 NILRRYIRDYGENENVNQAIQTLERAKRNALQAPDKTTLIGYEGLASNAYFEVLPVLILN 180 Query: 154 KQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL 213 ++ + GR P D +N +S A + + AA+ G P +GF+HT +P Sbjct: 181 QKADFPFQGRNRRPPK----DAVNAMLSFAYTLIANDVAAALETIGLDPYVGFLHTLRPG 236 Query: 214 --SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDV 271 S D+ + ++ + + + + S K I ++ Sbjct: 237 RTSLALDMMEELRAYLGDRFVLSLINKRQISSKEFLYQGDNGVVMSDKGKKIFISAWQNR 296 Query: 272 LAAGEIQP 279 I P Sbjct: 297 KRETIIHP 304 >UniRef50_C1XN81 CRISPR-associated protein Cas1 n=2 Tax=Meiothermus RepID=C1XN81_MEIRU Length = 323 Score = 192 bits (487), Expect = 2e-47, Method: Composition-based stats. 
Identities = 53/300 (17%), Positives = 99/300 (33%), Gaps = 36/300 (12%) Query: 18 IFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVW 77 + Q + + G + +P V +++ R++ A+ + G +++ Sbjct: 5 LTEQSSTLRLSQGRLRVELDEQTLAELPARKVRGVVVWGNVRLTTPALAFLLRQGVPVLY 64 Query: 78 VGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR----------KMFELRFG-- 125 G + P + L Q + L + + +M R Sbjct: 65 ATLEGQLYGQAQAPQSLAPEVLRAQLLAQQNP---LPLAQGFLLGKLRSGQMLLERLARQ 121 Query: 126 --------------EPAPARRSVEQLRGIEGSRVRATYALL-AKQYGVTWNGRRYDPKDW 170 E P RS+E LRGIEG+ RA +A L A ++GR P Sbjct: 122 APITPQQAEIEAALEALPQARSLEALRGIEGNAARAYFAGLQAVLAPYGFSGRNRRPP-- 179 Query: 171 EKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTV 228 D +N +S L G A+ AG P +G +HT + +D+ + + V Sbjct: 180 --TDAVNAALSYGYMVLLGRVLLALGIAGLHPELGLLHTEGRRVPALAFDLMEEFRVSVV 237 Query: 229 VPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQP 288 R+ P + ++ + L+ +E+ + P + Q Sbjct: 238 DAVVIAAFLRSELTPQQHSEARNGGVYLNEAGRKALLRRLEERFSQEAQHPKGFRKPYQE 297 >UniRef50_B4W4R1 CRISPR-associated protein Cas1 n=1 Tax=Microcoleus chthonoplastes PCC 7420 RepID=B4W4R1_9CYAN Length = 354 Score = 191 bits (486), Expect = 2e-47, Method: Composition-based stats. 
Identities = 41/305 (13%), Positives = 101/305 (33%), Gaps = 40/305 (13%) Query: 11 LKDRVSMIFLQYGQ-IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAA 69 ++ ++ ++ G I F++ + +P+ V I++ ++S +++ Sbjct: 18 HREMAAIYLIEQGTTIYKEYQRFIIYVSEKPKLEVPIREVQQILVFGNIQLSTPVMQVCL 77 Query: 70 QVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF--------- 120 + +V++ ++G D+ L Q + D + +V + + Sbjct: 78 REQIAVVFLSQSGRYHGHLWSSEFRDLDQELVQVRRWGDAAFQFQVSQAIVYGKLMNSKQ 137 Query: 121 -------ELRFG-------------EPAPARRSVEQLRGIEGSRVRATYALLAKQY---G 157 + + E S+++LRG EG + L + Sbjct: 138 LLLRFNRKRKLPDVERAIIGINQDIEALEFSESLDRLRGYEGIGAARYFPALGQLITNSR 197 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS--F 215 ++ R P D +N +S + L+ I+A G +P +G H G+ Sbjct: 198 FEFSLRNRQPP----TDPVNSLLSFGYTLLFNNVLGFIIAEGLSPYLGNFHYGERQKPYL 253 Query: 216 VYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLAC-RDIFRSSKTLAKLIPLIEDVLAA 274 +D+ + ++ V I + +P + ++ + + E + Sbjct: 254 AFDLMEEMRSVVVDSLVLNIVNHSLFKPQDFDTVPSTGGVYLNQSARRVFLKQFETRMNE 313 Query: 275 GEIQP 279 P Sbjct: 314 EVSHP 318 >UniRef50_Q8F1F5 Putative uncharacterized protein n=2 Tax=Leptospira interrogans RepID=Q8F1F5_LEPIN Length = 550 Score = 191 bits (485), Expect = 3e-47, Method: Composition-based stats. 
Identities = 56/322 (17%), Positives = 99/322 (30%), Gaps = 51/322 (15%) Query: 3 WLPLNPIPLKDRVSMIFL--QYGQIDVIDGAFVLID-----KTGIRTHIPVGSVACIMLE 55 + P+ P K + + + +I D ++ + IP+ + + + Sbjct: 199 YEPIRLFPEKREKTTLHVFGHDSRIKKSDNVLLVEKVTETGEKSKSEKIPIQEIESVNIH 258 Query: 56 PGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKV 115 ++S ++ + W G + + + Q K E +RL + Sbjct: 259 GNCQISSQMIKFLVSEEIPVHWFSGGGNYIGGININPSG-VQRRIRQFKALTKETIRLNL 317 Query: 116 VRKMFELRFGEP-------------------------------APARRSVEQLRGIEGSR 144 +K+ + + S QL GIEGS Sbjct: 318 AKKLVSAKCESQLRYLLRATRGKDETRNETESYLATIRSGLKNIESADSPSQLLGIEGSS 377 Query: 145 VRATYALLAKQYG-----VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAG 199 RA ++ L + NGR P D N +S S LY AI+A G Sbjct: 378 ARAYFSGLPALLKNSDPFLVPNGRSKRPPK----DPFNATLSFLYSLLYKSVRQAIIAVG 433 Query: 200 YAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRS 257 P+ GF HT + + V D+ ++ + R D + + ++ S Sbjct: 434 LDPSFGFYHTPRSSAEPLVLDLMELFRVSLCDMTLIGSINRKSW-IDEDFEITKNKVWLS 492 Query: 258 SKTLAKLIPLIEDVLAAGEIQP 279 K L E L P Sbjct: 493 ESGRKKATQLYETRLDDTWKHP 514 >UniRef50_B9LWK7 CRISPR-associated protein Cas1 n=4 Tax=Halobacteriaceae RepID=B9LWK7_HALLT Length = 331 Score = 191 bits (485), Expect = 3e-47, Method: Composition-based stats. 
Identities = 54/314 (17%), Positives = 94/314 (29%), Gaps = 41/314 (13%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 + G++ + + G H+PV S+ + L + A+ L G Sbjct: 5 NHHVFTDGELSRSEDTLRIDTLDGEVEHLPVESIDTLYLHGQIDFNTRALGLLNDHGVPA 64 Query: 76 VWVGEAGVRVYASGQPGGA---RSDKLLYQAKLALDEDLRLKVVRKMFELRFGE------ 126 G Y + ++ Q + D D RL + M E Sbjct: 65 HVFGWKD--YYKGSYLPKRSHLSGNTVVEQVRAYDDPDRRLGIATLMIEASIHNMRANLV 122 Query: 127 ---------------------PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTW--NGR 163 A S++ LRG E + + Y+ ++ + + R Sbjct: 123 YYNARDCSFDSEIDRLESLKTKASTAESIDGLRGTEATARKTYYSCFSEILRDPFALDRR 182 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIAD 221 Y+P + N IS + +Y +AI P +GF+H + + DIAD Sbjct: 183 EYNPP----TNETNALISFLNAMVYTACVSAIRKTALDPTVGFMHEPGDRRFTLSLDIAD 238 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPA 281 I K F + R PD E + ++ E+ L P Sbjct: 239 IYKPILADRVLFRLVNRRQISPD-EFESDLDGCLLTEDGRLTVLAEYEETLDKTVEHPRL 297 Query: 282 PPEDAQPVAIPLPV 295 + + V Sbjct: 298 KRNVSYKTLVQTDV 311 >UniRef50_D0KYZ2 CRISPR-associated protein Cas1 n=3 Tax=Bacteria RepID=D0KYZ2_HALNC Length = 344 Score = 190 bits (482), Expect = 6e-47, Method: Composition-based stats. 
Identities = 46/311 (14%), Positives = 91/311 (29%), Gaps = 45/311 (14%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 I + + + + + + +P+ + ++ +S A + A G LV Sbjct: 9 YIMTPNAYVHLENATVRIDVEREKKLQVPLHHLNGLVCFGNIMISPALMHRLADDGKSLV 68 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE---------- 126 + +G P + A D L++ R + + Sbjct: 69 LMDSSGRFKARLEGPVSGNILLRQAHHRQASDAAFALEIARTIVSGKLKNSRSVVQRGAR 128 Query: 127 -----------------------PAPARRSVEQLRGIEGSRVRATYALL------AKQYG 157 A A S+++LRGIEG R ++ + A + Sbjct: 129 ETSDTIETTQLTRSADNLAASLRAAAAATSMDELRGIEGEAARGYFSAINLIVKTAMRAN 188 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SF 215 NGR P D N IS + L +AI A G +GF+H +P + Sbjct: 189 FQLNGRTRRPP----LDRFNALISFLYAMLMNDCRSAIEATGLDAQLGFLHAVRPGRAAL 244 Query: 216 VYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 D+ + + A + R + + + ++ ++ Sbjct: 245 ALDLMEEFRAIAADRLALTLINRGQINASDFDEREGGAVMLNDRGRRAVVTAWQERKQEI 304 Query: 276 EIQPPAPPEDA 286 P + Sbjct: 305 VTHPLTETKIP 315 >UniRef50_O66692 Putative uncharacterized protein n=2 Tax=Aquificaceae RepID=O66692_AQUAE Length = 316 Score = 189 bits (481), Expect = 9e-47, Method: Composition-based stats. 
Identities = 44/296 (14%), Positives = 87/296 (29%), Gaps = 24/296 (8%) Query: 17 MIFL-QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 + ++ +G + + + ++ IPV V I + ++ + A G L Sbjct: 4 VYYINSHGTLSRHENTLRFENA-EVKKDIPVEDVEEIFVFAELSLNTKLLNFLASKGIPL 62 Query: 76 VWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKM---------FELRFG 125 + G + L+ Q + LD RL + + + + Sbjct: 63 HFFNYYGYYTGTFYPRESSVSGHLLIKQVEHYLDAQKRLYLAKSFVIGSILNLEYVYKIS 122 Query: 126 -----EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCI 180 S+ +L +E + Y L + G R P + +N I Sbjct: 123 ADTYLNKVKETNSIPELMSVEAEFRKLCYKKLEEVTGWELEKRTKRPPQ----NPLNALI 178 Query: 181 SAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARR 238 S S Y I P + ++H K S D+A++ K V + + Sbjct: 179 SFGNSLTYAKVLGEIYKTQLNPTVSYLHEPSTKRFSLSLDVAEVFKPIFVDNLIIRLIQE 238 Query: 239 NPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLP 294 N + F + + ++L P + + I L Sbjct: 239 NKIDKT-HFSTELNMTFLNEIGRKVFLKAFNELLETTIFYPKLNRKVSHRTLIKLE 293 >UniRef50_B8DWG2 CRISPR-associated protein Cas1 n=4 Tax=Bifidobacterium animalis subsp. lactis RepID=B8DWG2_BIFA0 Length = 535 Score = 189 bits (481), Expect = 9e-47, Method: Composition-based stats. 
Identities = 39/301 (12%), Positives = 82/301 (27%), Gaps = 38/301 (12%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 + + G + T +P+ S+ + + VS +R ++W G Sbjct: 219 ARAYLKSGRMHVSKNGDEITSVPLDSIQALQIHGNVDVSSGLMRELMWRNIPILWCSGTG 278 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA-------------- 128 + S G + + Q + RL + R+ + Sbjct: 279 RLMGWSVSSYGPNGETRVAQ--HVASHEGRLDLAREFISAKIHNQIVLLRRSDKNNNVLF 336 Query: 129 ---------PARRSVEQLRGIEGSRVRATYALLAKQYGV------TWNGRRYDPKDWEKG 173 ++ + +EG ++ V W R P Sbjct: 337 DMKHIEKSVVNANRIQDILSLEGQAAALYFSQFHHLISVNKRNEWPWLERMRHP----AP 392 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPK 231 D +N + S L AI++ G GF+H+ K + D+ + + Sbjct: 393 DPLNALLDYTYSLLLSDCIRAIVSCGLDAHAGFLHSSKRNKPALALDLMEEFRAPIADSV 452 Query: 232 AFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAI 291 + + + + + +T LI E +A P + + Sbjct: 453 VQTVVNNGEIKRNGFANV-MGSVRLRDETRKTLIGAYERRMATELKHPVYAYRASWRRIV 511 Query: 292 P 292 Sbjct: 512 E 512 >UniRef50_Q2NH78 Putative uncharacterized protein n=1 Tax=Methanosphaera stadtmanae DSM 3091 RepID=Q2NH78_METST Length = 332 Score = 189 bits (480), Expect = 1e-46, Method: Composition-based stats. 
Identities = 48/308 (15%), Positives = 100/308 (32%), Gaps = 40/308 (12%) Query: 18 IFL-QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 +++ G + + + T IP+ ++ I + A+ L + ++ Sbjct: 13 LYINSNGILYRKENTLKF-KNKEVDTSIPIHAINEINCYGKVSLRSGAISLLQKEKIIIN 71 Query: 77 WVGEAGVRVYASGQPG---GARSDKLLYQAKLALDEDLRLKVVRKM-------------- 119 + + G Y ++ QA +E+ R + ++M Sbjct: 72 FFNKYG--YYEGSLYPKIALNSGVIVVKQALTYNNENKRCFIAKEMVNGMKHNMIKTLKY 129 Query: 120 FELRFGEPAPARRSVEQ----------LRGIEGSRVRATYALLAKQYG-VTWNGRRYDPK 168 ++ + + +++E + EG ++ Y R + P Sbjct: 130 YKKKGKDVDEHIQNLENESILDGNINRILSSEGKLWQSYYPSFDNITKKFPIEKREFKPP 189 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFD 226 E +N IS S LY T + I P+I F+H + S D+ADI K Sbjct: 190 KNE----MNSLISYGNSLLYTTTLSEIYHTYLHPSISFLHEPRERRFSLACDLADIFKPL 245 Query: 227 TVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDA 286 + F++ N ++ + ++ + K K I + L P + + Sbjct: 246 IISRTIFKLVNTNIIN-EKHFKKDV-GVYLNEKGRQKFIQEYNNKLKTTIKHPQIKKKVS 303 Query: 287 QPVAIPLP 294 I L Sbjct: 304 YRYLIRLE 311 >UniRef50_B0JKW9 CRISPR-associated protein Cas1 n=6 Tax=Cyanobacteria RepID=B0JKW9_MICAN Length = 334 Score = 189 bits (480), Expect = 1e-46, Method: Composition-based stats. 
Identities = 43/324 (13%), Positives = 91/324 (28%), Gaps = 45/324 (13%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + +++ + D + +P+ +V +++ VS A V + Sbjct: 1 MGTVYVTREDAFLGKNDERLTVKASKETILDVPLLNVEGVVIFGRGSVSPALVIELLERH 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE------ 126 L +V G + Q + A D + +VR + Sbjct: 61 IPLSFVTATGKYLGRLEPEMTKNIFVRRAQWQAAGDSPKAIHLVRGFVRGKLKNYRNILL 120 Query: 127 -----------------------PAPARRSVEQLRGIEGSRVRATYALLAKQY---GVTW 160 +++ LRG+EG+ A ++ T+ Sbjct: 121 RRQRDRKELDLQTAIACLEAAIGSIETTSAIDSLRGLEGAGSAAYFSCFDSLILGDTFTF 180 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYD 218 R P D +N +S + L ++AI G+ P +G++H + S D Sbjct: 181 ASRNRRPPR----DPVNSLLSLGYALLRHDVQSAINLVGFDPYLGYLHYQRYGRPSLALD 236 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIF-RSSKTLAKLIPLIED----VLA 273 + + + V P+ + + L + L E Sbjct: 237 LMEEFRAIVVDAVVLNGVNHPYLTPEHFTTEPLSGAVSLTREGLQIFLRLYEQKKQSKFK 296 Query: 274 AGEIQPPAPPEDAQPVAIPLPVSL 297 + ++A + L Sbjct: 297 HPVMGKQCTYQEAFEIQARLMAKY 320 >UniRef50_Q2FQQ2 CRISPR-associated protein Cas1 n=2 Tax=Methanomicrobia RepID=Q2FQQ2_METHJ Length = 347 Score = 189 bits (479), Expect = 1e-46, Method: Composition-based stats. 
Identities = 44/301 (14%), Positives = 93/301 (30%), Gaps = 28/301 (9%) Query: 10 PLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRT-HIPVGSVACIMLEPGTRVSHAAVRLA 68 L+D I YG+I + G + D G P+ V + + VS ++ Sbjct: 25 LLEDDTIYITTPYGKISLDGGRIQVKDSDGEIVASFPLEKVCTMNVFGSASVSTPLLKHC 84 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA 128 + ++ + G Y + L+ + + + ++ L + R++ + Sbjct: 85 SDKEVVINYFTNFGK--YFGSFVPSRNTIALVRRHQAGITKEKSLAICREIIHAKLQNSC 142 Query: 129 P---------------------ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDP 167 SV+ LRGIEG + +L+ W R Sbjct: 143 VFLARKRVEVPSLLKELRDRSLHAVSVDSLRGIEGEAASIYFPMLSSSLPDEW--RSDKR 200 Query: 168 KDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKF 225 D +N +S + + +A+ P IG +H + + D+ + + Sbjct: 201 TRRPPRDELNALLSLTYTMVNTEVISALRQYNLDPFIGVMHVDRHGKPALALDLLEEFRP 260 Query: 226 DTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPED 285 + + D + + + K L +E+ L + + Sbjct: 261 VFCDAFVARLINKRMITKDDFTQGSRLNDTAFKKYLGFYHDFMEESLKHPRFKYSVSRKK 320 Query: 286 A 286 A Sbjct: 321 A 321 >UniRef50_B7KM77 CRISPR-associated protein Cas1 n=1 Tax=Cyanothece sp. PCC 7424 RepID=B7KM77_CYAP7 Length = 334 Score = 189 bits (479), Expect = 2e-46, Method: Composition-based stats. 
Identities = 39/302 (12%), Positives = 99/302 (32%), Gaps = 41/302 (13%) Query: 15 VSMIFL-QYGQ-IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 ++ ++L + G + ++ + + + V IM+ ++S A+ + Sbjct: 1 MATLYLMEQGTWVQKEQERLIIQVSKTQKMEVLMREVERIMIFGNVQLSTPAINACLKHN 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE----------L 122 L++++ +AG G + + Q K + + ++K+ + + + Sbjct: 61 ILVLFLNQAGQYNGHLWSLGSIHLNNEMVQIKRHQEHEFQVKISKAIVYGKLMNSKRLLM 120 Query: 123 RFGEP-------------------APARRSVEQLRGIEGSRVRATYALLAKQY---GVTW 160 R + +++QLRG EG + + ++ Sbjct: 121 RLNRKRQVPDMDKVIEGINSDILSLESVDNLDQLRGYEGIAAARYFPAFGQLITNAAFSF 180 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYD 218 + R P D +N +S + L+ + I++ G +P G H G+ +D Sbjct: 181 SLRNRQPP----TDPVNSLLSFGYTLLFNNVLSLIISEGLSPYFGNFHYGERDKPYLAFD 236 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACR-DIFRSSKTLAKLIPLIEDVLAAGEI 277 + + + V + +A ++ + K + E + Sbjct: 237 LMEEFRAIIVDGMVLRVINNGLLTLKDFEPVASNGGVYLTDKGRRIFLKEFESRINKLIS 296 Query: 278 QP 279 P Sbjct: 297 HP 298 >UniRef50_Q8YWX6 Alr1468 protein n=4 Tax=Cyanobacteria RepID=Q8YWX6_ANASP Length = 668 Score = 187 bits (476), Expect = 3e-46, Method: Composition-based stats. 
Identities = 45/301 (14%), Positives = 98/301 (32%), Gaps = 40/301 (13%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 ++ +++ Q + V + F + + +R +PV V+ ++L VSH AV +A + Sbjct: 335 MTTLYITDQGAYLSVKNQQFQVFHQGELRIKVPVMRVSNVVLFGCCNVSHGAVSMALRRR 394 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG------- 125 ++++ + G G A+ + L+ Q + + + K + + Sbjct: 395 IPIMYLSQKGRYFGRLQTEGDAKVEYLMLQVERCQNHEFTRKQAEAIVRAKLHNSRALLL 454 Query: 126 ----------------------EPAPARRSVEQLRGIEGSRVRATYALLAKQY--GVTWN 161 E S++ LRG EG + L + + Sbjct: 455 KLNRRHPSKIAATAISGIAELMEKLSLAESMDSLRGYEGKAATLYFQGLGSLFTGAFVFE 514 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDI 219 R P D +N +S + L + + G G +H + + V D+ Sbjct: 515 KRTKRPP----TDPVNSLLSLGYTLLSQNVFSFVQVIGLHTHFGNLHVPRDNHPALVSDL 570 Query: 220 ADIIKFDTVVPKAFEIARRNPGEPDREVRLACR-DIFRSSKTLAKLIPLIEDVLAAGEIQ 278 + + V + + + ++ L K + E+ L + Sbjct: 571 MEEFRAQLVDSLVAYLINSKIFTFEDFTPPDEKGGVYLQPHALKKFLKHWEEKLQSEVTH 630 Query: 279 P 279 P Sbjct: 631 P 631 >UniRef50_B7GYY4 CRISPR-associated protein Cas1 n=8 Tax=Proteobacteria RepID=B7GYY4_ACIB3 Length = 321 Score = 187 bits (476), Expect = 3e-46, Method: Composition-based stats. 
Identities = 57/308 (18%), Positives = 114/308 (37%), Gaps = 51/308 (16%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDK---TGIRTHIPVGSVACIMLEPGTRVS 61 L I R ++ +L++ ++ DG + + + +IP+ + I+L GT ++ Sbjct: 8 DLKAILHSKRANLYYLEHCRVMQKDGRVLYLTEAKNENQYWNIPIANTTVILLGTGTSIT 67 Query: 62 HAAVRLAAQVGTLLVWVGEAGVRVYAS-------GQPGGARSDKLLYQAKLALDEDLRLK 114 AA+R+ G L+ + G G ++A Q ++ + DE RL Sbjct: 68 QAAMRMLCSAGVLVGFCGGGGTPLFAGSEVEWLTPQSEYRPTEYMQGWMSFWFDETKRLD 127 Query: 115 VVRKMFELR--------------------------------FGEPAPARRSVEQLRGIEG 142 V + R F + P V L E Sbjct: 128 VAKAFQFARIEFIRKIWAKDKDLKDEGFYLDNLDIQQALNGFEKKIPNMTKVGDLLLAEA 187 Query: 143 SRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAP 202 + Y + A + +++ ++ E+GD N ++ YG++ + G + Sbjct: 188 QTTKQLYKIAATRCKLSFE------RNPEQGDLANDFLNHGNYLAYGLSATTLWVLGISH 241 Query: 203 AIGFVH-TGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTL 261 + +H + + V+D+AD+IK V+P AF A+ ++E R F + L Sbjct: 242 SFAVMHGKTRRGALVFDVADLIKDAVVLPWAFICAKEGAT--EQEFRQQLLQKFTDYRCL 299 Query: 262 AKLIPLIE 269 + ++ Sbjct: 300 DWMFDQVK 307 >UniRef50_Q1J1U7 CRISPR-associated protein Cas1 n=1 Tax=Deinococcus geothermalis DSM 11300 RepID=Q1J1U7_DEIGD Length = 342 Score = 187 bits (476), Expect = 3e-46, Method: Composition-based stats. 
Identities = 39/316 (12%), Positives = 97/316 (30%), Gaps = 43/316 (13%) Query: 11 LKDRVSMIFLQ--YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 ++ ++ +++Q + + + + + +P+ + +++ +S + Sbjct: 1 MRQLLNTLYIQTQGTYLHLDTDNIRVEVERTKKAMLPLHHIEGVVVFGNVLLSPFLIHRL 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP- 127 A+ + W+ E G + + P Q A + L + R + + Sbjct: 61 AREHKPVTWLSEHGRFMARTETPMSGNVLLRTAQHACAGNAARTLAIARLIAAGKLQNQK 120 Query: 128 -------------------------------APARRSVEQLRGIEGSRVRATYALLAKQY 156 P +V+++RG EG+ R + + Sbjct: 121 VTLLRAAREAEADDAALLRQAARDINVQIACLPLTETVDEVRGTEGTAARLYWEVFPLML 180 Query: 157 G----VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKP 212 W R+ D IN ++ + L +A A G P +GF+H +P Sbjct: 181 RQNRDFFWLSERHRRP---ARDPINALLNFVYTVLANDCASACQAVGLDPQLGFLHALRP 237 Query: 213 L--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 S D+ + ++ + R P V + + + ++ + + Sbjct: 238 GRSSLALDLMEELRPVIADRAILTLINRQQLTPRDFVLHEGGTVSITEEGRKTILAHLAE 297 Query: 271 VLAAGEIQPPAPPEDA 286 + P + Sbjct: 298 RRREEVMHPLTARKTP 313 >UniRef50_B2SPB2 Crispr-associated protein Cas1 n=56 Tax=Bacteria RepID=B2SPB2_XANOP Length = 344 Score = 186 bits (473), Expect = 7e-46, Method: Composition-based stats. 
Identities = 40/304 (13%), Positives = 93/304 (30%), Gaps = 47/304 (15%) Query: 11 LKDRVSMIF--LQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 ++ +++ ++ + V+ + R +PV + ++ VS + Sbjct: 1 MRRQLNTLYATTDGAWLRKDGANIVMEVERQERARLPVHMLESLVCIGRVAVSPQLLGFC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE------- 121 ++ G + ++ G + P Q + + D +VR + Sbjct: 61 SEHGISICYLTPQGRFLARVEGPVSGNVLLRRAQYRRSDDPAGCAAIVRHLLAGKIHNQR 120 Query: 122 ---------------------------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAK 154 R + V+ LRG+EG ++ + + + Sbjct: 121 AVLARGWRDHGDCLTDVAAFQHSLKRLKRIPQRVLVETDVDVLRGLEGEAAQSYFGVFGQ 180 Query: 155 QYG-----VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 + + GR P D N +S + L +A+ + G PA+GF+H Sbjct: 181 LVRADKPLLRFGGRNRRPPR----DAFNALLSFLYTLLTHDCRSALESVGLDPAVGFLHR 236 Query: 210 GKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPL 267 +P S D+A+ + A + R + + ++ Sbjct: 237 DRPGRPSLALDLAEEFRPLLGERLALSLINRRQLNERDFQVFDNGAVLLKDDSRKTVLIA 296 Query: 268 IEDV 271 ++ Sbjct: 297 YQER 300 >UniRef50_Q53W21 Putative uncharacterized protein TTHB145 n=3 Tax=Thermus thermophilus RepID=Q53W21_THET8 Length = 315 Score = 186 bits (472), Expect = 9e-46, Method: Composition-based stats. 
Identities = 55/275 (20%), Positives = 90/275 (32%), Gaps = 26/275 (9%) Query: 35 IDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGA 94 ++ P V + L R+S A+ + G + + G +G Sbjct: 22 EEEGREVAGFPARQVRSVALWGNVRLSTPALVFLLRQGVPVFFYSLEGFLHGVAGAYPDP 81 Query: 95 RSDKLLYQ--------AKLALDEDLRLKVVRKMFELRFGEP---------APARRSVEQL 137 L Q A+ + LR + + R E A +E+L Sbjct: 82 HPAHLRAQFAAEGLPLARAFVVGKLRSALA-LLERHRLPEAGGVVEALARAEGASELERL 140 Query: 138 RGIEGSRVRATYALLAKQYGVT-WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAIL 196 RG EG R + LA+ G + GR P D +N +S + L G A+ Sbjct: 141 RGAEGEGSRVYFQGLARLLGPYGFGGRTRRPPR----DPVNAALSYGYALLLGRVLVAVR 196 Query: 197 AAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDI 254 AG P +GF+H S D+ + + V RR P + + Sbjct: 197 LAGLHPEVGFLHAEGRRSPALALDLMEEFRVPVVDQVVLSAFRRGLLTP-SHAEVREGGV 255 Query: 255 FRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPV 289 + + + +LI L E+ L G P + Sbjct: 256 YLNEEGRRRLIQLFEERLLEGVSHPLGFRKPLGET 290 >UniRef50_B9M5J4 CRISPR-associated protein Cas1 n=9 Tax=Bacteria RepID=B9M5J4_GEOSF Length = 344 Score = 185 bits (470), Expect = 2e-45, Method: Composition-based stats. 
Identities = 36/312 (11%), Positives = 90/312 (28%), Gaps = 47/312 (15%) Query: 11 LKDRVSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K ++ +++ Q + + + ++ +P+ + ++ +S + Sbjct: 1 MKQLLNTLYVMTQGSYLSLDHETVRVEVNGKLQMQVPLHHLGSVVTFGNVMISPFFLGKC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA 128 A G +V + +G Q + D+ + R M + Sbjct: 61 ADDGRPVVILSRSGRYKCRMVGKTSGNVLLRQAQYEAVRDKGKAAAIARNMVAGKVKNAR 120 Query: 129 ------------PARRS---------------------VEQLRGIEGSRVRATYALL--- 152 P + ++Q+RG+EG A +A+ Sbjct: 121 QVLMRGARESDSPEEETALRKASEIHADTLFRLKSICEIDQVRGLEGESAAAYFAVFDQM 180 Query: 153 ---AKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 + + R P D +N +S + + +A+ + G +GF+H Sbjct: 181 VKDGDRAAFAMDNRNRRPP----LDPMNALLSFLYTLVLNDCISAVESVGLDSQMGFLHA 236 Query: 210 GKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPL 267 +P S D+ + + A + R ++ + ++ Sbjct: 237 LRPGRPSLGLDLMEEFRAVIADRLALTLINRKQITEKDFEPRPGGAVYLNEDGRKTVVVA 296 Query: 268 IEDVLAAGEIQP 279 + P Sbjct: 297 YQKRKQEEFFHP 308 >UniRef50_Q57823 Uncharacterized protein MJ0378 n=20 Tax=Euryarchaeota RepID=Y378_METJA Length = 322 Score = 185 bits (470), Expect = 2e-45, Method: Composition-based stats. 
Identities = 41/293 (13%), Positives = 93/293 (31%), Gaps = 31/293 (10%) Query: 11 LKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 ++ + + L G + + + G + + + + I + +S A+ AQ Sbjct: 1 MRKKSLTL-LSDGYLFRKENTIYFENARGKK-PLAIEGIYDIYIYGKVSISSQALHYLAQ 58 Query: 71 VGTLLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE-------- 121 G L + G + + ++ Q + LD+D RL++ + Sbjct: 59 KGIALHFFNHYGYYDGSFYPRESLHSGYLVVNQVEHYLDKDKRLELAKLFIIGGIKNMEW 118 Query: 122 --LRFG---------EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWN--GRRYDPK 168 L+F E + ++ +EG Y L + + R P Sbjct: 119 NLLKFKNKTKFSSYIEELNNCNKITEVMNVEGRVRTEYYRLWDETLPDDFKIVKRTRRPP 178 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFD 226 E +N IS S LY + P + ++H + S D+++I K Sbjct: 179 KNE----MNALISFLNSRLYPAIITELYNTQLTPTVSYLHEPHERRFSLALDLSEIFKPM 234 Query: 227 TVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 A + ++ + + R + + + + ++ + Sbjct: 235 IADRLANRLVKQGIIQ-KKHFRDDLNGVLLNKEGMKVVLEHFNKEMDKTVNHK 286 >UniRef50_C0FSR1 Putative uncharacterized protein n=1 Tax=Roseburia inulinivorans DSM 16841 RepID=C0FSR1_9FIRM Length = 359 Score = 184 bits (468), Expect = 2e-45, Method: Composition-based stats. 
Identities = 50/294 (17%), Positives = 93/294 (31%), Gaps = 40/294 (13%) Query: 14 RVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT 73 + + G + D + + G + IPV + + + +S QVG Sbjct: 31 DHNYHLINEGILTKQDFNILFESENGKKY-IPVETTDSLYIYSNVIMSGNFFDFMNQVGL 89 Query: 74 LLVWVGEAGVRVYASGQPGGARSDKL-LYQAKLALDEDLRLKVVRK-------------- 118 + ++ + G ++ + R+ K L Q ++ E RL + R+ Sbjct: 90 NVSFINKYGEKIGSFVPNNSRRNIKTELKQLRMYDSEKERLDMARRLEIASVSNIRANLR 149 Query: 119 MFELR---------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG---VTW 160 ++ R R + + +E + Y + Sbjct: 150 YYQRRKNATELGAAVKDMTDIITKLNEARDINHMMMLEAQARQKYYGCFNSILEGKQFYF 209 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYD 218 + R P D +N IS + LY I IG VH +P S D Sbjct: 210 DKRTRRPPQ----DPLNAMISFGNTLLYQRIANEINRTSLDIRIGIVHAAGNRPESLNLD 265 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 +AD+ K V F + R + V + I+ +++ I E+ L Sbjct: 266 LADLFKPILVDRTIFTLVNRKMINVNDFVEVENNGIYLNNRAKKIFISEYENKL 319 >UniRef50_B7C8S2 Putative uncharacterized protein n=2 Tax=Eubacterium biforme DSM 3989 RepID=B7C8S2_9FIRM Length = 329 Score = 184 bits (468), Expect = 3e-45, Method: Composition-based stats. 
Identities = 47/297 (15%), Positives = 96/297 (32%), Gaps = 38/297 (12%) Query: 13 DRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +V ++ + G + D + VL K +IP+ V I+ ++ + L + Sbjct: 2 KKVVYLY-KSGNLKRKDNSLVLESKD-KDDYIPIEQVDMIICFSEVSLNKRVLALLNKYE 59 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR--------------K 118 L+++ G + L+ Q DE RL + + K Sbjct: 60 VLILFYNFYGNYIGRYAPKDYKDGRVLVNQVNKYRDESQRLYISKSILKASIKNMLSVLK 119 Query: 119 MFELRFGEPAP-------------ARRSVEQLRGIEGSRVRATYALLAKQY---GVTWNG 162 + + ++ +L IE + + Y + + Sbjct: 120 YYRKKGKNLDELIRKLEDLVGMASDIETMNELMLIEANAKQTYYKMFDVVLENEEFKFQK 179 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIA 220 R +P E +N +S + LYG+ + + + P I F+H + S YD+A Sbjct: 180 RTKNPPQNE----VNAMLSYGYTLLYGIILSILDRSSLFPQISFIHSLSKNSDSLQYDLA 235 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 DI K + + R+ + + S + I +L + + Sbjct: 236 DIFKPVYIDRMVLRLIRKKQIKQSHFQYKPDGRCYLSKEGTKVFIEEFNILLQSTIV 292 >UniRef50_Q6L317 DNA polymerase n=2 Tax=Thermoplasmatales RepID=Q6L317_PICTO Length = 320 Score = 184 bits (468), Expect = 3e-45, Method: Composition-based stats. 
Identities = 54/293 (18%), Positives = 99/293 (33%), Gaps = 36/293 (12%) Query: 15 VSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 + Q G+I + K R H+PV +V I++ +S A+ +++G + Sbjct: 2 YNFYITQDGEITRDGNTLYFVGKDFKR-HLPVMNVNEIIISAKVSISSWALDYLSKLGIM 60 Query: 75 LVWVGEAGVRVYASGQPGGAR--SDKLLYQAKLALDEDLRLKVVRKMF---------ELR 123 + + G +S PG + ++ Q + L++D RL + +M LR Sbjct: 61 VHILNIYGN-YMSSLIPGNRNEKGNIIIMQVRSYLNDD-RLYIASQMVLGIKHNILRNLR 118 Query: 124 FGEPA--------------PARRSVEQLRGIEGSRVRATYALLAKQYGVTWNG-RRYDPK 168 + P ++ + G EG+ Y+ Y R + P Sbjct: 119 YYNKNNALDDKIEKISGYYPDGNNINSILGTEGNIWSTYYSAFPYIYKKYHEFKREFHPP 178 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFD 226 D +N IS S LY +I G P+I ++H + S DI+DI K Sbjct: 179 K----DELNAMISFGNSLLYSNVITSIFLNGLNPSISYLHEPSERSFSLALDISDIFKPV 234 Query: 227 TVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 V + N + + + + I + + Sbjct: 235 IVERVIANLVNNNIIDSN-HFTKENNGYYLNDNGRKIFINYYNQKMNSTINIK 286 >UniRef50_C6CA70 CRISPR-associated protein Cas1 n=56 Tax=Bacteria RepID=C6CA70_DICDC Length = 333 Score = 184 bits (468), Expect = 3e-45, Method: Composition-based stats. 
Identities = 65/314 (20%), Positives = 117/314 (37%), Gaps = 52/314 (16%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGI---RTHIPVGSVACIMLEPGTRVS 61 L I R ++ +LQ+ +I V G + + G +IP+ + + +ML GT V+ Sbjct: 9 DLKTILHSKRANIYYLQHCRILVNGGRVEYVTEEGNQSLYWNIPIANTSVVMLGTGTSVT 68 Query: 62 HAAVRLAAQVGTLLVWVGEAGVRVYAS-----------GQPGGARSDKLLYQAKLALDED 110 AA+R A+ G ++ + G G ++A+ Q ++ L +E Sbjct: 69 QAAMREFARAGVMVGFCGSGGTPLFAANEAEVAVSWLSPQSEYRPTEYLQDWVSFWFNEQ 128 Query: 111 LRLKVVRKMFELRFGE----------PAPARRSV-----EQLRGI--------------- 140 RL ++R G+ +R ++ E L Sbjct: 129 QRLAAAIAFQQVRIGQIRQHWLGGRLARESRFTIKPEHVEALLNRYQQGLVDCRTSNDVL 188 Query: 141 --EGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAA 198 E +A Y L A G K D N+ + YG+ A+ Sbjct: 189 VQEAMMTKALYRLAANAVSY---GDFTRAKRGGGTDLANRFLDHGNYLAYGLAAVALWVL 245 Query: 199 GYAPAIGFVH-TGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRS 257 G + +H + V+D+AD+IK ++P+AF A +++ R C FR Sbjct: 246 GLPHGLAVLHGKTRRGGLVFDVADLIKDALILPQAFIAAMEGE--DEQDFRQRCLTAFRQ 303 Query: 258 SKTLAKLIPLIEDV 271 ++ L +I ++ V Sbjct: 304 AEALDVMIDSLQQV 317 >UniRef50_Q0AW57 CRISPR-associated protein, Cas1 family n=1 Tax=Syntrophomonas wolfei subsp. wolfei str. Goettingen RepID=Q0AW57_SYNWW Length = 336 Score = 183 bits (466), Expect = 4e-45, Method: Composition-based stats. 
Identities = 41/306 (13%), Positives = 89/306 (29%), Gaps = 41/306 (13%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGIRTHI-PVGSVACIMLEPGTRVSHAAVRLAAQV 71 +S +++ + +I V + V+ K I P+ V +++ +S V+ + Sbjct: 1 MSFLYVYERSAKIGVQENCVVVESKKENLKRILPIEGVENVIIFGDASLSSNCVKQFMER 60 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA--- 128 L W+ G + Q D++ L + +++ + Sbjct: 61 DINLTWLSSRGKFYGRLESTRNVNIYRQRKQFACGEDDEFCLALAKRIILAKVKNQITIL 120 Query: 129 --------------------------PARRSVEQLRGIEGSRVRATYALLAKQY--GVTW 160 + ++L G EG R Y LA+ + Sbjct: 121 RRYRRNRPEKSVQKIIDAMAKLLPIMERVHNKDELMGHEGMAARYYYQGLAELVEPDFAF 180 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYD 218 +GR P D N +S A + L A + G P F+H+ + + D Sbjct: 181 SGRNRQPPR----DPFNSLLSFAYTLLMYDLYTAAVNRGLNPYASFLHSIRRGHPALCSD 236 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRL-ACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 + + + A + + + + + ++ I E + Sbjct: 237 LMEEWRAILADSLALYVTSKGIIKRENFEKPNEEGGVYLDGIGSKAFIAEYEKKVRGRSN 296 Query: 278 QPPAPP 283 Sbjct: 297 YLAYVD 302 >UniRef50_B8G918 CRISPR-associated protein Cas1 n=5 Tax=Chloroflexi (class) RepID=B8G918_CHLAD Length = 339 Score = 183 bits (466), Expect = 5e-45, Method: Composition-based stats. 
Identities = 54/318 (16%), Positives = 101/318 (31%), Gaps = 48/318 (15%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 ++ +++ Q +I + I +P+ + I++ +S A++ G Sbjct: 1 MATLYVIEQGAEIGCDGERIEVRRGADIIGSVPLVKLDDIVIFGNVGISTPAMKRLLDRG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP----- 127 + ++ G A Q A D L + ++ E + Sbjct: 61 IEVTFMTVDGRYQGRLIGQVTAHVALRHAQYACAADPARALALAQRFVEGKLRNQRALLQ 120 Query: 128 --------------------------APARRSVEQLRGIEGSRVRATYALLAKQYGVTWN 161 + L G+EGS +A L G W+ Sbjct: 121 RFSRNRAEPPPEAQAAADDLEAYIKRVKRTTQLSSLLGVEGSATARYFAGLRSLIGPEWS 180 Query: 162 --GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFV 216 GR+ P D +N +S + L A+ AAG+ P +GF+H+ G+P S Sbjct: 181 FSGRQRRPPP----DPVNLLLSLGYTLLAHKVLGAVQAAGFDPYLGFLHSLDYGRP-SLA 235 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVR--LACRDIFRSSKTLAKLIPLIEDVLAA 274 DI + + + I P+ R R I + + + E+ + Sbjct: 236 LDIMEEFRPILIDSLVVRICNDGRIRPE-HFRPGEGERPIIITDEGKRAFLTAFEERMRT 294 Query: 275 GEIQPPAPPEDAQPVAIP 292 P D+ P +P Sbjct: 295 EATHPEGA--DSGPGKVP 310 >UniRef50_A4X3M4 CRISPR-associated protein, Cas1 family n=4 Tax=Actinomycetales RepID=A4X3M4_SALTO Length = 326 Score = 182 bits (462), Expect = 1e-44, Method: Composition-based stats. 
Identities = 52/290 (17%), Positives = 91/290 (31%), Gaps = 30/290 (10%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 S + +I D + + G IP+ + ++L ++ AAV L ++ G + Sbjct: 7 SYWLTEPCRIRREDNSIRIERADGQPVRIPITDIRDLVLFDNADINTAAVSLLSRHGVTV 66 Query: 76 VWVGEAGVRVYASGQPGG-ARSDKLLYQAKLALDEDLRLKVVRKMFEL------------ 122 + G A + + + Q L + RL V + + Sbjct: 67 HLLDHYGNYAGALTPADDMSSAHVVRAQVALTGNPQARLAVAQALVRATAVNVAWALGTD 126 Query: 123 -------RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNGRRYDPKDWEKG 173 R A S L G+EG+ R + +L + +GR P Sbjct: 127 LLDGPLERLPAQIGASTSSGDLMGVEGNFRRTAWGVLDTLLPPWLRLDGRTRRPP----S 182 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG---KPLSFVYDIADIIKFDTVVP 230 + N IS + Y AI PAIGF+H + + D+A+ K Sbjct: 183 NAGNAFISYLNAITYARVLTAIRCTPLHPAIGFLHADTDRRRNTLALDLAEPFKPLLAER 242 Query: 231 KAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 A + + R S K+ ++ + LA Sbjct: 243 LLRRAAAQRTLTA-ADFVSDVRSASLSQAGRKKIAVMVREELATTVQHRQ 291 >UniRef50_C9RRG3 CRISPR-associated protein Cas1 n=1 Tax=Fibrobacter succinogenes subsp. succinogenes S85 RepID=C9RRG3_FIBSS Length = 343 Score = 182 bits (461), Expect = 2e-44, Method: Composition-based stats. 
Identities = 46/322 (14%), Positives = 92/322 (28%), Gaps = 44/322 (13%) Query: 15 VSMIFLQYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT 73 + + + G ++ G V+ I + V + L + A+ G Sbjct: 1 MHLYVTEQGTRLGKNGGHLVVQRDGCTIDDILLSEVDSLSLFGAVHPTTDAMLALLDKGA 60 Query: 74 LLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL----------- 122 + ++ G G L Q + D+D + + Sbjct: 61 DIAFLSSGGRYRGRLVSAVGKNVPLRLCQYDVFRDDDRAFALAKSCVVRKLENGLRVLEA 120 Query: 123 ---------RFGEPAP-----------ARRSVEQLRGIEGSRVRATYALLAKQY--GVTW 160 RF ++LRG EG+ R + + G+ + Sbjct: 121 YSKNPHNSFRFENRDEYLRNLNAVRRLQGFDRDELRGFEGNGARIYFENFGRCLACGLDF 180 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYD 218 GR+Y P D +N +S + E+ + + G P +G++H S D Sbjct: 181 PGRKYHP----STDPVNALLSFGYTLTARSLESLLESYGMDPMLGYLHEPSYGRNSLAQD 236 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRL----ACRDIFRSSKTLAKLIPLIEDVLAA 274 + + + V + R D + + +F + + + ED +A Sbjct: 237 MLEEFRHPLVDRLVLFLFNRRVLVADDFEQRNDENSSGQLFLKPEKMRVFLHHYEDFVAR 296 Query: 275 GEIQPPAPPEDAQPVAIPLPVS 296 + L V Sbjct: 297 PNGIYQGLANLPWRSVMRLRVE 318 >UniRef50_B5IHG3 CRISPR-associated protein Cas1 n=3 Tax=Aciduliprofundum boonei T469 RepID=B5IHG3_9EURY Length = 321 Score = 181 bits (460), Expect = 2e-44, Method: Composition-based stats. 
Identities = 52/305 (17%), Positives = 95/305 (31%), Gaps = 31/305 (10%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 S+ + I L+ K G + IP+ ++ I S A++ G ++ Sbjct: 3 SLYITKEAIIKREANTIYLVRK-GEKRSIPIHNLRDITCIAPVSFSSGAIKHVLNSGVVV 61 Query: 76 VWVGEAGVRVYA-SGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE------------- 121 + G + + ++ QAK + + R + ++M E Sbjct: 62 HFFDMYGNYEGTLYPRERSISGEVVVNQAKHYIFWEKRKYIAKEMIEGIKHNILRNLKKS 121 Query: 122 --------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG-VTWNGRRYDPKDWEK 172 + S+E+L E Y L R Y P Sbjct: 122 NKELEEIIEKIERVEVEGDSIEELMNREAQIWGYYYKSLDYTLKKFQLERRDYRPP---- 177 Query: 173 GDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVP 230 + +N IS + LY I P+I ++H + S DIAD+ K V Sbjct: 178 INELNALISFGNTLLYSAVLTEIYHTHLNPSISYLHEPSERRFSLSLDIADVFKPTMVYR 237 Query: 231 KAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVA 290 +I + + R +F + + K I ED +++ +Q Sbjct: 238 HIHDIVNHGIIT-EDDFRKEFNGVFLNEQGKRKYIRKWEDRMSSTVYHRRLKRSVSQRGL 296 Query: 291 IPLPV 295 + L Sbjct: 297 LRLEA 301 >UniRef50_UPI0000F51762 hypothetical protein Faci_00015 n=1 Tax=Ferroplasma acidarmanus fer1 RepID=UPI0000F51762 Length = 328 Score = 181 bits (460), Expect = 2e-44, Method: Composition-based stats. 
Identities = 42/290 (14%), Positives = 87/290 (30%), Gaps = 32/290 (11%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 + + V G ++ + G IPV +V + +S V + +++G + Sbjct: 6 YHIIHESTVYVDGGTIIIENSIGKNA-IPVENVRSVYAHKPVSISSGVVSIISKLGIPIH 64 Query: 77 WVGEAGVRVYASGQ-PGGARSDKLLYQAKLALDEDLRLKVVRKMFE-------------- 121 + G D ++ QA+ LD R+ + + Sbjct: 65 FFNWYGNYEATLWPKSKDISGDVIIKQAQKYLDIHERINIAKSFVSGALHNFNRILSEYD 124 Query: 122 ----LRFGEPAPA-------RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDW 170 + E + + ++ GIEG + + + + R Sbjct: 125 TETVKKSREDIKSNIGNLTNAMDITEIMGIEGRSHNSYFKAMDSVIPEKF--RINKRIRR 182 Query: 171 EKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTV 228 G+ N IS S +Y I P I ++H + S DIA+I K Sbjct: 183 PPGNMGNALISFGNSLVYASVLTEIYFTHLNPTISYLHEPSERRFSLSLDIAEIFKPIIS 242 Query: 229 VPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 + + + + + + + K + E+ L + Sbjct: 243 HKLFLYLINKKIIN-ENDFDNSLEKVILNEKGKKLYLKNYEEKLTSTVFH 291 >UniRef50_Q65S18 Putative uncharacterized protein n=1 Tax=Mannheimia succiniciproducens MBEL55E RepID=Q65S18_MANSM Length = 333 Score = 181 bits (460), Expect = 2e-44, Method: Composition-based stats. 
Identities = 46/293 (15%), Positives = 94/293 (32%), Gaps = 39/293 (13%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + +++ ++ V + +K IP+ SV + ++ + + + + G Sbjct: 1 MPTLYIDRRTTELKVNGDVLICYEKGERIATIPLASVDRLYMKGDINLQISLLSKLGEKG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR--------- 123 +V++ + + + + Q LA ++ L + + + + Sbjct: 61 IGVVFLQGRKNKPMQFLPQPHNDAYRRVTQTYLADNKLFCLTLAKNIVLNKCIKQCQFLA 120 Query: 124 ------------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNGR 163 + +++ LRGIEG +A A + +NGR Sbjct: 121 KFIEHNPKIITFIAELQKLFNLIVKQENIDSLRGIEGRMGAIYFAAFADILPRSLGFNGR 180 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIAD 221 P D +N +S + LY A+ AG P IGF HT S D+ + Sbjct: 181 NRRPPK----DPVNAVLSLTYTLLYSEATLAVYGAGLDPYIGFFHTLHFGRKSLSCDLME 236 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 I+ V E + D+ + E V++ Sbjct: 237 PIRPS-VDEWIAECFTAEVLKIDQ-FSQTNEGCILGKEGRVIFYTAFEKVVSE 287 >UniRef50_C7RP03 CRISPR-associated protein Cas1 n=1 Tax=Candidatus Accumulibacter phosphatis clade IIA str. UW-1 RepID=C7RP03_9PROT Length = 339 Score = 180 bits (458), Expect = 4e-44, Method: Composition-based stats. 
Identities = 52/343 (15%), Positives = 106/343 (30%), Gaps = 63/343 (18%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 ++ +++ + A V + +P+ ++ + + +S A + + G Sbjct: 1 MTSLYVDRRGVTLKADGEALVFYENGERIGTVPLAPLSRVFMRGDVTLSSALLGKLGERG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP----- 127 +V + + + + Q +L+LD D L+ R + E + Sbjct: 61 IGVVVLSGRKAVPTMLLGRPHNDAARRVAQYRLSLDGDFCLRFARAIVEAKLRAQATFLT 120 Query: 128 --------------------------APARRSVEQLRGIEGSRVRATYALLAKQYG--VT 159 + S+ LRG+EG+ A + A + Sbjct: 121 ERRDSEPHSRYLLTLSLRRLATSIAAVDEQGSLGSLRGLEGAGAAAYFEGFADLLPERLK 180 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVY 217 ++GR P D +N +S + L+ A+ AG P +GF H S Sbjct: 181 FSGRNRRPPR----DPVNAMLSLGYTLLHAEAVLALYGAGLDPFVGFYHALDFGRESLAC 236 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDV------ 271 D+ + ++ + V A + R + A+ E+V Sbjct: 237 DLVEPLRVE-VDRHALMLFRSEKLRAEG-FSTTESGCLLGKAGRARFYGEWEEVAARLRK 294 Query: 272 -------------LAAGEIQPPAPPEDAQPVAIPLPVSLGDAG 301 + A ++ P P P A V+ + G Sbjct: 295 LLAESVSDVASAIMQAAAVEAPCSPPIGDPAA-DAGVADEETG 336 >UniRef50_A4FJX8 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Saccharopolyspora erythraea NRRL 2338 RepID=A4FJX8_SACEN Length = 549 Score = 180 bits (458), Expect = 4e-44, Method: Composition-based stats. 
Identities = 47/314 (14%), Positives = 85/314 (27%), Gaps = 45/314 (14%) Query: 4 LPLNPI-PLKDRVSMIFLQYGQ-IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVS 61 P + R + + G + + + + V ++ +V+ Sbjct: 207 KPRKLLARAPARDPVYVTEPGTTVGIRSERLAVRKDHEELLSCRLRDVLHLVAAGPVQVT 266 Query: 62 HAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE 121 AV A+ G+ +VW G G + Q + L R++ Sbjct: 267 SQAVHALAEQGSPVVWTSTTGRLKSVDIPTVGKHVELRRRQFTA--TPNTALDFARRIVG 324 Query: 122 LRFGEPA-------------------------PARRSVEQLRGIEGSRVRATYALLAKQY 156 + + L GIEG+ R +A L + + Sbjct: 325 GKIRNARTLLRRNPHVEEPDLLNRLDADAVRAERAPTRSTLLGIEGAAARTYFAGLVETF 384 Query: 157 GVTWN---------GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFV 207 GR P D ++ +S L A A G P GF Sbjct: 385 RTDHRLPGPAFDTMGRTRRPPR----DAVSCLLSFLYCLLIKDITTACYALGLDPYFGFY 440 Query: 208 HTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLI 265 H + + D+A+ + A + +P + + +S +I Sbjct: 441 HQPRHGRPALTLDLAEEFRPLIADSTALTLINNLQADPAM-FHVHSTAVALTSSGRRDVI 499 Query: 266 PLIEDVLAAGEIQP 279 E L I P Sbjct: 500 DAYERRLTTEVIHP 513 >UniRef50_D1N0J7 CRISPR-associated protein Cas1 n=1 Tax=Victivallis vadensis ATCC BAA-548 RepID=D1N0J7_9BACT Length = 347 Score = 180 bits (457), Expect = 5e-44, Method: Composition-based stats. 
Identities = 42/313 (13%), Positives = 86/313 (27%), Gaps = 53/313 (16%) Query: 18 IFLQY-GQIDVIDGAFVLIDKTGIRTH-------------------IPVGSVACIMLEPG 57 ++L G++ D T +PV S+ + + Sbjct: 3 LYLTRSGRLRRRDNTLRFERVNLPETEDPEVEAEALEEGKADAVQALPVESIDAVYVFGE 62 Query: 58 TRVSHAAVRLAAQVGTLLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVV 116 V+ + + + G D ++ Q + + + RL + Sbjct: 63 LSVNTKLINFLNCKKVPVHFFNWYGHHTGTLLPHAEQLSGDLVIRQGEAYRNPEERLMIC 122 Query: 117 R--------------KMFELR-------------FGEPAPARRSVEQLRGIEGSRVRATY 149 R + ++ R + + + E L G+EG+ + Y Sbjct: 123 RNLLEAVFHNILSVLQYYQRRKGGLEKSIAAVKELEQKLAEQDTPEGLMGLEGNVRKLYY 182 Query: 150 ALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 G + + +N IS S LY + + P I ++HT Sbjct: 183 QSWPVWLGK--TAEHFKRVYHPPDNPLNALISFLNSLLYTACVSELYRTALYPGISYLHT 240 Query: 210 G--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPL 267 + S D+ + K V F + DR+ R + +++ Sbjct: 241 PQTRRFSLALDLVEPFKPLLVDRMIFRLLDSRAIG-DRDFRKHSNGFLLTDDARRRILQE 299 Query: 268 IEDVLAAGEIQPP 280 + L P Sbjct: 300 WDQELRRTVHYAP 312 >UniRef50_D0MJ58 CRISPR-associated protein Cas1 n=1 Tax=Rhodothermus marinus DSM 4252 RepID=D0MJ58_RHOM4 Length = 360 Score = 179 bits (455), Expect = 1e-43, Method: Composition-based stats. 
Identities = 44/341 (12%), Positives = 80/341 (23%), Gaps = 65/341 (19%) Query: 14 RVSMIFLQYGQIDVIDGAFVLIDKTGIR-------------------THIPVGSVACIML 54 + G++ G R T PV V + Sbjct: 2 KRPYYIFSSGRLRRRQNTLFFEKAAGERIPDDQDETGVPSGTPTGESTPFPVEQVESLYF 61 Query: 55 EPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRL 113 ++ + A+ + G + + Q + R Sbjct: 62 FGEVDLNSKLLTFLARHDIPAHFYDYYGNYTGTYIPRDYLHSGRLRIEQVLHYVRPKRRR 121 Query: 114 KVVR--------------KMFELRFGEP------------------APARRSVEQLRGIE 141 + R + + R + + +L GIE Sbjct: 122 YLARAIVEAATYNLLRVLRYYVNRLEGERREAVAEAIATIEQERTQLRSAEKIPELMGIE 181 Query: 142 GSRVRATYALLAKQYG------VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAI 195 G A Y+ + R P + IN IS S Y T I Sbjct: 182 GRSREAYYSAWPSILADGPGEAFVFEKRERRPP----SNEINALISFGNSLCYTTTIRQI 237 Query: 196 LAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRD 253 P I ++H + S D+++I K V F + + P R Sbjct: 238 HRTALDPTISYLHEPGARRFSLALDLSEIFKPILVDRAIFRLVKTGEITP-RHFEERLGG 296 Query: 254 IFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQPVAIPLP 294 ++ + + ++ L + I L Sbjct: 297 VYLKEEGRRIFVRHWDERLRQTVYHRRLERHVSYERLIRLE 337 >UniRef50_C8Q0H7 CRISPR-associated protein Cas1 n=6 Tax=Proteobacteria RepID=C8Q0H7_9GAMM Length = 334 Score = 179 bits (454), Expect = 1e-43, Method: Composition-based stats. 
Identities = 57/305 (18%), Positives = 118/305 (38%), Gaps = 42/305 (13%) Query: 7 NPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKT----GIRTHIPVGSVACIMLEPGTRVSH 62 P+ L R + +L+ ++ + D V + ++ +IP + A ++L G+ ++ Sbjct: 27 RPLMLSKRACVFYLERVRVILKDDRIVYLTESMQPIEHFYNIPEKNTAFLLLGKGSSITD 86 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYAS-------GQPGGARSDKLLYQAKLALDEDLRLKV 115 AA R A+ ++ + G G ++++ Q ++ + K LD+ RL + Sbjct: 87 AAARRLAESNVMVGFCGSGGSPLFSALDLTFLAPQSEYRPTEYMQIWMKAWLDDTTRLLM 146 Query: 116 VRKMFELR----------------------------FGEPAPARRSVEQLRGIEGSRVRA 147 + + + R F + + + EQL EG + Sbjct: 147 AKVLLQERIEIVKKYWQKNPLLTSYGIRLDESAVVNFSQAIESAMNQEQLLTAEGRWAKV 206 Query: 148 TYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFV 207 Y LA+ G + + + D N + YG A+ G + A+ + Sbjct: 207 LYKSLAEGCGFKFTREEGKNANDDIADIANSYLDHGNYIAYGYAAVALHGLGISFALPML 266 Query: 208 H-TGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIP 266 H + V+D+AD++K V+P+AF A+ G +E R+ +I + L + Sbjct: 267 HGKTRRGGLVFDVADLVKDAMVMPQAFISAK--LGHNQKEFRMQLIEICQDQDVLDYMFG 324 Query: 267 LIEDV 271 + + Sbjct: 325 FVVET 329 >UniRef50_UPI0001C41A73 CRISPR-associated protein Cas1-2 n=1 Tax=Methanobrevibacter ruminantium M1 RepID=UPI0001C41A73 Length = 334 Score = 178 bits (453), Expect = 2e-43, Method: Composition-based stats. 
Identities = 44/289 (15%), Positives = 80/289 (27%), Gaps = 42/289 (14%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + D V+ + + ++ I+L ++ A+ L A+ V + G Sbjct: 12 VAKRDNQIVIKENGKEINYYLAKDISQILLTGKGSITFDALTLLAENDVDCVSINWKGHV 71 Query: 85 VYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP----------------- 127 Y P + Q D + + + Sbjct: 72 DYRLSAPDRKNAIVKKEQYFALTDSRSG-YLAKAFVRAKIENQKAVLGTLAKSREEKDYI 130 Query: 128 -------------------APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPK 168 + + GIEG ++ A W + Sbjct: 131 IEQREKVSEHIGKIEKLSNINSDNIRNNILGIEGQASHEYWSAFASVLDEKWEF--FGRS 188 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKFD 226 D +N ++ + + +I AG P GF+H+ + S VYD+ + + Sbjct: 189 GRGAKDPVNSLLNYGYAVIESEIWKSIYLAGLDPYCGFLHSERYGRASLVYDLIEEFRQQ 248 Query: 227 TVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 V I RN PD + I + LI I D L + Sbjct: 249 IVDKTVLSIVNRNQITPD-DFEEDGNYIKIHERARRLLIAKILDKLNSK 296 >UniRef50_D1VVR4 CRISPR-associated endonuclease Cas1, DVULG subtype n=1 Tax=Peptoniphilus lacrimalis 315-B RepID=D1VVR4_9FIRM Length = 343 Score = 178 bits (452), Expect = 2e-43, Method: Composition-based stats. 
Identities = 38/310 (12%), Positives = 87/310 (28%), Gaps = 45/310 (14%) Query: 11 LKDRVSMIFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 +K ++++++ + V+ + P+ + ++ VS +++ Sbjct: 1 MKKLLNVLYVTSPDYYLKKEGRNIVISLEGKRVARYPIHILRQVVCFNYMGVSPDLMKMC 60 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDED-----LRLKVVRKMFELR 123 + + + G Q KLA E+ + + + Sbjct: 61 MEENVSISFFTPYGKYCGRVVGASYGNIYNRKRQYKLAESEESLQFVKNIIYAKAYNSRK 120 Query: 124 --FGEPAPARRSV-------------------------EQLRGIEGSRVRATYALL---- 152 + + + +RGIEG+ R +++L Sbjct: 121 IIIRGKLDHKDKIDLEKVENVISNIKDLMIQIQYSPDKDSIRGIEGTIARQYFSVLDEFI 180 Query: 153 -AKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK 211 ++ + R P D N +S S L +A+ G GF HT + Sbjct: 181 VKQREDFYFIERTKRPPR----DRFNAMLSFMYSILTNSIASALEGVGIDSYAGFFHTDR 236 Query: 212 PL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIE 269 P S DI + ++ + + N + + K ++ Sbjct: 237 PGRVSMALDIIEEMRAFIIDKFCLSMINLNRINKNNFEIKENGACLLNDKGREIILQNWN 296 Query: 270 DVLAAGEIQP 279 + P Sbjct: 297 KKQQEEILHP 306 >UniRef50_A9BUF1 CRISPR-associated protein Cas1 n=33 Tax=Proteobacteria RepID=A9BUF1_DELAS Length = 337 Score = 178 bits (452), Expect = 2e-43, Method: Composition-based stats. 
Identities = 65/334 (19%), Positives = 121/334 (36%), Gaps = 56/334 (16%) Query: 5 PLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIR---THIPVGSVACIMLEPGTRVS 61 L I R ++ +L++ ++ V G + G R +IP+ + I+L GT ++ Sbjct: 8 DLKTILHSKRANIYYLEHCRVLVNGGRVEYVTDAGKRSLYWNIPIANTTSILLGAGTSIT 67 Query: 62 HAAVRLAAQVGTLLVWVGEAGVRVYAS-----------GQPGGARSDKLLYQAKLALDED 110 AA+R A+ G L+ + G G ++++ Q ++ L + D++ Sbjct: 68 QAAMRELAKAGVLVGFCGGGGTPLFSANEVDVEVAWLTPQSEYRPTEYLQAWVQFWFDDE 127 Query: 111 LRLKVVRKMFELR-------------------------------FGEPAPARRSVEQLRG 139 LRL +++ LR F V L Sbjct: 128 LRLHAAKQLQALRLQRLQQEWGARALRESGFAVDMERLKALVQQFAALMANAPDVMTLLT 187 Query: 140 IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAG 199 E +A + L G G K D N+ + YG+ A G Sbjct: 188 DEARLTKALFKLAVDAVGY---GEFTRAKRGTGTDGANRYLDHGNYLAYGLGATATWVLG 244 Query: 200 YAPAIGFVH-TGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSS 258 + +H + V+D AD++K ++P+AF A R + + + R C + S Sbjct: 245 LPHGLAVLHGKTRRGGLVFDAADLVKDAAILPQAFLSAMRG--DDEPQFRRQCIEALTRS 302 Query: 259 KTLAKLIPLIEDVL-----AAGEIQPPAPPEDAQ 287 ++L +I ++ + AG P ++A Sbjct: 303 ESLDFIIDTLKRIAVDTARMAGASATPGAAQEAG 336 >UniRef50_Q6L363 DNA polymerase n=1 Tax=Picrophilus torridus RepID=Q6L363_PICTO Length = 318 Score = 178 bits (451), Expect = 2e-43, Method: Composition-based stats. 
Identities = 43/282 (15%), Positives = 103/282 (36%), Gaps = 26/282 (9%) Query: 15 VSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 + + G ++ + + + IP+ ++ I++ ++ A+ + ++ Sbjct: 2 QTYYIISSGTLNREMDSLRFSNSNENKI-IPLENIESIIISGNVSITKPAISILSKKNIP 60 Query: 75 LVWVGEAGVRVYASGQPGGA-RSDKLLYQAKLALDEDLRLKVVR---------------- 117 + ++ + + ++ QA A + D R+K+ R Sbjct: 61 VFFMSMYDNYISSLIPEDYLLSGKVIMNQAIKAYNIDERIKIARIFVYAAARNMAIVLKR 120 Query: 118 -KMFELRFG-EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDT 175 M +++ + +S+ +L EG+ R + + QY + N + + Sbjct: 121 GNMGKIKIPYKEIMESKSINELMSYEGNF-RNNFLNIVDQY-LPNNYKIIKRSRRPPRNK 178 Query: 176 INQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAF 233 +N IS LY +TE+ I PAI F+H + S D+++I K A Sbjct: 179 MNALISYLNMLLYSITESQIFLTHLNPAISFLHEPFERRNSLSLDVSEIFKPLICDRLAI 238 Query: 234 EIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 ++ + + L +F + K++ + +D + Sbjct: 239 KMVKLKIIKDSDF--LEENGVFLNENGRKKVVKMFDDKMQET 278 >UniRef50_B7A8Y4 CRISPR-associated protein Cas1 n=1 Tax=Thermus aquaticus Y51MC23 RepID=B7A8Y4_THEAQ Length = 316 Score = 177 bits (449), Expect = 4e-43, Method: Composition-based stats. 
Identities = 49/278 (17%), Positives = 82/278 (29%), Gaps = 34/278 (12%) Query: 35 IDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGA 94 +K P V + + R+S A+ + G + ++ G +G Sbjct: 22 EEKGVKVGSFPARQVRRVAVWGNVRLSTPALVFLLRQGAPVFFLSLEGFLYGVAGAFPDP 81 Query: 95 RSDKLLYQAKLALDEDLRLKVVRKMFELR------------FGEPAP---------ARRS 133 L Q L + R + E + Sbjct: 82 HPAHLRAQFGA-----EALPLARAFVLGKLRSALALLLRHGLPEAEEVAQALARAGEAQR 136 Query: 134 VEQLRGIEGSRVRATYALLAKQYG-VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTE 192 +E LRG EG RA + L + + GR P D +N +S + L G Sbjct: 137 LESLRGAEGEGSRAYFQGLGRLLAAQGFGGRTRRPPR----DPVNAALSYGYALLLGRVL 192 Query: 193 AAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLA 250 A+ AG P +GF+H S D+ + + V RR P Sbjct: 193 VAVRLAGLHPEVGFLHAEGRRSPALALDLMEEFRVPVVDAVVLSAFRRGHLTP-AHAEAR 251 Query: 251 CRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPEDAQP 288 ++ + + +LI L+E P + Sbjct: 252 EGGVYLNEEGRRRLIELLEGRFLEEVAHPLGFRKPLGE 289 >UniRef50_B8GSH8 CRISPR-associated protein Cas1 n=1 Tax=Thioalkalivibrio sp. HL-EbGR7 RepID=B8GSH8_THISH Length = 691 Score = 176 bits (446), Expect = 9e-43, Method: Composition-based stats. 
Identities = 40/287 (13%), Positives = 78/287 (27%), Gaps = 38/287 (13%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGE 80 + ++ + G + P+ ++A + L +VS + G L Sbjct: 371 EPARVGLDGGRLRVQRGEAELLSAPLETLAGVTLLGPHQVSTQLLGALLDRGIPLALATG 430 Query: 81 AGVRVYASGQPGGARSD--KLLYQAKLALDEDLRLKVVRKMFELRFGEPAP--------- 129 G + + QA D+ L+ R + + R + Sbjct: 431 QGRLRGVLWNGVPGDAGPGLWMRQAACFEDDARALEAARAVVDARLRQQREVLRNRMSPE 490 Query: 130 -----------------ARRSVEQLRGIEGSRVRATYALLAKQYGVT--WNGRRYDPKDW 170 A L G+EG R + LA+ + GR P Sbjct: 491 RCDDLLPRLDRLIAKTAAASDRASLNGLEGQAARLYFGALAELLPPELGFTGRNRRPPR- 549 Query: 171 EKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKFDTV 228 D N +S + L+ + + G P GF H + D+ + + V Sbjct: 550 ---DPFNVLLSLGYTVLHAHVDTVVRLNGLYPWRGFYHQPHGLHPALASDLMEPFR-HLV 605 Query: 229 VPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 A + R + + + + + + L Sbjct: 606 ERVALNVVARGRIRV-SDFAQQGDACRIEAGARRRYLADLSERLLTP 651 >UniRef50_C8W3G7 CRISPR-associated protein Cas1 n=24 Tax=Bacteria RepID=C8W3G7_DESAS Length = 335 Score = 176 bits (446), Expect = 9e-43, Method: Composition-based stats. 
Identities = 42/290 (14%), Positives = 81/290 (27%), Gaps = 37/290 (12%) Query: 18 IFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVW 77 G + D + + G IP+ + +S ++L A+ G ++ + Sbjct: 9 YIFSAGDLYQKDFSIAFRKEDGNFY-IPIKDTRELYCFNDITLSTKLLQLLAKAGIVVHF 67 Query: 78 VGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLK------VVRKMF---------- 120 G + + + QA L++ + + + Sbjct: 68 FGYYENYIGTFYPKEQLLSGRLTVAQALAYEQNRLQIAGQIIKGIAKNTYFVLYHYYRHG 127 Query: 121 -----------ELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTW--NGRRYDP 167 + +++QL IEG Y ++ N R P Sbjct: 128 KSELKDFLDWLRKDVSRLVDSVGNIKQLLRIEGEIWARFYQSFRVFLPESFAMNKRVKRP 187 Query: 168 KDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKF 225 D + IN IS + LY T I I F+H + S D++++ K Sbjct: 188 PD----NPINALISFGNTLLYTKTITQIFHTHLNQTISFLHEPAERRFSLSLDLSEVFKP 243 Query: 226 DTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 V F+ ++ + I E+ L Sbjct: 244 VLVCKTIFDCVNNRKIMVEKHFDKKLNYALLNELGRKVFIEAFEERLNQT 293 >UniRef50_Q74N45 NEQ017 n=1 Tax=Nanoarchaeum equitans RepID=Q74N45_NANEQ Length = 333 Score = 175 bits (445), Expect = 1e-42, Method: Composition-based stats. 
Identities = 52/297 (17%), Positives = 96/297 (32%), Gaps = 42/297 (14%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 ++ L G++ I+K ++ IP+ S+ I + V++ A++L A + Sbjct: 3 TIYILSIGKLYRGKNGLYFINKDKKKSPIPLESIKEIFILNKVSVTYNALKLLADRNIPI 62 Query: 76 VWV---GEAGVRVYASGQPG---GARSDKLLYQAKLALDEDLRLKVVRKMF--------- 120 + + G+ Y L+ Q + D + R ++ ++ Sbjct: 63 HFFYENTKKGISYYLGSFLPRQKTKSGLVLVKQVEAYKDIEKRTEIALEIVDAIRYNCIK 122 Query: 121 -----------ELR-------FGEPAPA-RRSVEQLRGIEGSRVRATYALLAKQYGVT-W 160 ELR F E + ++ +RGIE + Y L K + Sbjct: 123 VLEKYHIDEVKELRKIDVWKMFEESLNDWKDAINIIRGIESNIWNLFYQGLDKILKLYKL 182 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYD 218 R P E N +S A + LYGVT I P I F+H S D Sbjct: 183 ERRTRRPPKNEA----NTIVSFANTLLYGVTLTEIYKTHLDPTISFLHELRDTRYSLALD 238 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 +++ K + + + D + + + +I L Sbjct: 239 LSENFKPIITFRILIWLVNQGIIK-DTHFVKGLNGVLLNEQGKKLVIKEFNKRLDET 294 >UniRef50_Q8YZS6 Alr0381 protein n=6 Tax=Cyanobacteria RepID=Q8YZS6_ANASP Length = 374 Score = 175 bits (444), Expect = 2e-42, Method: Composition-based stats. 
Identities = 45/310 (14%), Positives = 93/310 (30%), Gaps = 45/310 (14%) Query: 15 VSMIFL-QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT 73 ++ ++L + G I ++I K + + + I++ PG +++ + G Sbjct: 39 MTTLYLTEPGTILRYRNESLIIMKQEKSHNCRLAEITLIVVLPGVQLTDVVISQLLDRGI 98 Query: 74 LLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP---- 129 +++ + G L Q + + + +K+ + Sbjct: 99 ETIFLRQDGQFRGRLQGHFATNMTIRLAQYRTVET-TFGMALAQKLVIGKVRNQRVLLQR 157 Query: 130 ------------------------------ARRSVEQLRGIEGSRVRATYALLAKQYGVT 159 + +L G+EG R Y L + Sbjct: 158 RNRATNGQISELTEAIDLISVYASQLNNTTTPLNRNELMGVEGICARTYYQALKHWFPTQ 217 Query: 160 WN--GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSF 215 WN GR P D IN +S L +A + AG P +GF H + Sbjct: 218 WNFNGRNRRPP----LDPINALLSWGYGVLLARVFSACVQAGLDPYLGFFHAIEPYRPNL 273 Query: 216 VYDIADIIKFDTVVPKAFEIARRNPGEPDREV-RLACRDIFRSSKTLAKLIPLIEDVLAA 274 V D+ + + V + + + + I+ + L+ +E Sbjct: 274 VLDLMEEFRPVVVDQAVISLIQSDLLTQEDFQPSPDGVGIWLGTMGKKLLLGELERRFRT 333 Query: 275 GEIQPPAPPE 284 + PP + Sbjct: 334 SVLYPPQNRQ 343 >UniRef50_Q2LQX3 Uncharacterized protein predicted to be involved in DNA repair n=1 Tax=Syntrophus aciditrophicus SB RepID=Q2LQX3_SYNAS Length = 386 Score = 173 bits (440), Expect = 5e-42, Method: Composition-based stats. 
Identities = 45/315 (14%), Positives = 91/315 (28%), Gaps = 37/315 (11%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 + Q +I ++ + + + +++ ++H A+ + G +V Sbjct: 4 YVRTQGARIIKEGRHLLVRKGDAVYHTLFTYKLDQLVIFGNVEITHQALAQLMRYGIDVV 63 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP------- 129 ++ G + P Q L +E L+ VR + + A Sbjct: 64 FLSFRGRYLGRISPPESKNVFLHKRQYSLLGNETFTLRQVRAIVAGKLANMATLLMRIKR 123 Query: 130 ----------------------ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDP 167 SV+ LRG EG + + + + Sbjct: 124 SRNVSLASQKAHEIQSLIRLLFQAESVDSLRGYEGRGSALYFEAFGRGFIENQGF--FRR 181 Query: 168 KDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKF 225 D +N +S + L AA+ AG P GF+H S V D+ + + Sbjct: 182 VRRPPTDPVNSVLSLLYTFLMNRVYAAVRVAGLDPYPGFLHALDYGRYSLVLDLMEEFRT 241 Query: 226 DTVVPKAFEIARRNPGEPDRE-VRLACRDIFRSSKTLAKLIPLIED---VLAAGEIQPPA 281 + + + R+ ++ + D ++A G Sbjct: 242 IIADTLTLSLFNLKILKREDFYFEEPVREEDFVQDPGERIADVSRDPIGLIANGHGDSEL 301 Query: 282 PPEDAQPVAIPLPVS 296 Q +A +PV Sbjct: 302 LDLPEQRMAGEMPVE 316 >UniRef50_B9LX94 CRISPR-associated protein Cas1 n=2 Tax=Halobacteriaceae RepID=B9LX94_HALLT Length = 331 Score = 173 bits (440), Expect = 5e-42, Method: Composition-based stats. 
Identities = 41/292 (14%), Positives = 80/292 (27%), Gaps = 36/292 (12%) Query: 16 SMIFL--QYGQIDVIDGAFVLID---KTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 S++++ Q Q+ G + D G P + I + G S V A + Sbjct: 11 SVVYVTKQGSQVGTEGGRITVWDVDGDEGELASFPTEKLDTINVFGGVNFSTPFVAEANR 70 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP- 129 G +L + + G + ++ Q L DE + + M + Sbjct: 71 HGIILNYFTQNGKYRGSFVPEKNTIAEVRRAQYDL--DETAEIDIAADMIAAKIRNARTL 128 Query: 130 --------------------ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKD 169 + + LRG+EG + L + W Sbjct: 129 LSRKGVHGTELLKDLGVRATTVATKDGLRGVEGEAAERYFNRLDETLTDGWTFE--KRTK 186 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDT 227 D IN +S + +A+ P +G +H + S D+ + + Sbjct: 187 RPPEDHINSLLSLTYVFMKNEVLSALRQYNLDPFLGVLHADRHGRPSLALDLQEEFRPIF 246 Query: 228 VVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 + R D +D + ++ + P Sbjct: 247 CDAFVTRLVNRGVITHDEFT----QDNHLADDAFQTYCSKFDEFMQEEFTHP 294 >UniRef50_UPI000174611D CRISPR-associated protein Cas1/Cas4 n=1 Tax=Verrucomicrobium spinosum DSM 4136 RepID=UPI000174611D Length = 847 Score = 173 bits (440), Expect = 5e-42, Method: Composition-based stats. 
Identities = 49/332 (14%), Positives = 91/332 (27%), Gaps = 62/332 (18%) Query: 5 PLNPIPLKDRVSMIFL-QYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 P + +D ++L G + V+ D + + + + + +++ Sbjct: 485 PRRLMAARDDARALYLSTPGYHVGRSGELLVVKDGASLVEEFRINDLTNVAVFGNVQITT 544 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 AV++ + L + G + + Q + A D L + R+ Sbjct: 545 QAVQVLCEKEIPLAYFSTGGWFYGLTRGHVTKNVFTRIEQFRAADDPMRCLALSRRFVAG 604 Query: 123 RFG-------------------------EPAPARRSVEQLRGIEGSRVRATYALLA---- 153 + + A RS+E L GIEG+ A Sbjct: 605 KIRNHRTLLMRLHVEPPAAVLARLKQASQDALGARSLETLLGIEGAAAALYLQHFAGMIK 664 Query: 154 ------------------------KQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 + + R P D +N +S A S L Sbjct: 665 VGAADDDDEIPGLESASATRVPEESVFTFDFTKRSRRPP----TDPVNALLSLAYSLLAK 720 Query: 190 VTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREV 247 A A G+ P +GF H + + D+ + + PD V Sbjct: 721 DCTIAAHAVGFDPYVGFYHQPRYGRPALALDLMEEFRPLVAESVVLTAINNRMLVPDHFV 780 Query: 248 RLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 R A + +S E + + P Sbjct: 781 R-AGEGVNLTSAGRKVFFQAYEQRMGSIITHP 811 >UniRef50_D0MKP4 CRISPR-associated protein Cas1 n=1 Tax=Rhodothermus marinus DSM 4252 RepID=D0MKP4_RHOM4 Length = 525 Score = 173 bits (438), Expect = 9e-42, Method: Composition-based stats. 
Identities = 43/318 (13%), Positives = 93/318 (29%), Gaps = 51/318 (16%) Query: 8 PIPLKDRVSMIFLQY--GQIDVIDGAFVLID----KTGIRTHIPVGSVACIMLEPGTRVS 61 +P + +++ + V+ + +P V ++L +++ Sbjct: 178 IMPPVRQARTLYVDEIGAVVRRKGRQLVVTVSRDGRRQELLRVPALLVDQVVLVGPVQIT 237 Query: 62 HAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE 121 A+R+ + +V++ G L Q + D + L + R Sbjct: 238 SQALRMLLRRNVDIVYLSGEGRFEGRLAAEFHPHVALRLAQYEAFRDPERTLTLARLFVR 297 Query: 122 LRFG-----------------------------EPAPARRSVEQLRGIEGSRVRATYALL 152 + E ++E LRG+EG+ R +++ Sbjct: 298 GKLQNMAGLLRRYADEYGSASLRAAASEINRDLERLEQVTTLEALRGVEGTASRRYFSVF 357 Query: 153 AKQY---------GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPA 203 + + GR P D +N + + L G AA AG P Sbjct: 358 GEMLRAEAYAPTGWPAFPGRHRRPP----TDPVNATLGYLYALLLGNVVAACALAGLDPY 413 Query: 204 IGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTL 261 +G++H S D+ + + A + R P + ++ + Sbjct: 414 VGYLHAPAYGRPSLALDLMEEFRAPAADRLALRLFNRGRLRP-QHFEERNGGVYLNEAGR 472 Query: 262 AKLIPLIEDVLAAGEIQP 279 A ++ + P Sbjct: 473 AVVLEAWQAHRQQTSAHP 490 >UniRef50_D2LF35 CRISPR-associated protein Cas1 n=1 Tax=Rhodomicrobium vannielii ATCC 17100 RepID=D2LF35_RHOVA Length = 366 Score = 172 bits (437), Expect = 1e-41, Method: Composition-based stats. 
Identities = 48/283 (16%), Positives = 80/283 (28%), Gaps = 37/283 (13%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 + V + ++ P+ V+ + + RV+ A + G +VW G G Sbjct: 50 AVVRVNNSTLLVERPGEPVFERPIELVSTLHIHGWARVTGACIGRLTAQGATVVWRGLHG 109 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA-------------- 128 V + GA D Q A E L + R + + Sbjct: 110 YPVALAQPMHGAGLDIRRAQYFEAAGERG-LAIARALISAKIQNMRGLVRRRANIEGRDC 168 Query: 129 ----------PARRSVEQLRGIEGSRVRATYALLAKQY-----GVTWNGRRYDPKDWEKG 173 S E L GIEGS ++ + V + R P Sbjct: 169 LTALAALAKKAKHASRESLLGIEGSATAFYFSAWPHMFAARAGDVEFEVRSRRPPQ---- 224 Query: 174 DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPK 231 + +N +S A + L A+ A G P +G H + S D+ + + Sbjct: 225 NAVNATLSYAYAVLSAECVCALAAVGLDPRLGVFHQPRSGRASLALDLMEPFRPLIADQA 284 Query: 232 AFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 + A +I L+E L Sbjct: 285 VLTGFNTGQIRTG-DAAEADDGWRLGETGKRTVIDLMEKRLTT 326 >UniRef50_C7NA04 CRISPR-associated protein Cas1 n=1 Tax=Leptotrichia buccalis C-1013-b RepID=C7NA04_LEPBD Length = 323 Score = 171 bits (434), Expect = 2e-41, Method: Composition-based stats. 
Identities = 36/297 (12%), Positives = 85/297 (28%), Gaps = 43/297 (14%) Query: 18 IFLQY-GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 +++ G + I + + + + ++L G ++ ++LA +V Sbjct: 3 LYITDLGTVVKKRDDLFEITTSEKKVAVAPQKIKSLVLSKGIFLTTDVIKLAVDNNIDIV 62 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVE- 135 V + G Q + + + +++ +++ + A ++ Sbjct: 63 IVDDFGNPYGRFWQSKFGSTANIRRKQLEIFGTQKGIELAKQILIQKIKNCAEHLEDLKI 122 Query: 136 -----------------------------------QLRGIEGSRVRATYALLAKQYGVTW 160 L G EG+ + Y L++ + Sbjct: 123 KREAKKAFLDKQIKEMKRYIYQIKLVEGNVSEKRGTLMGYEGNAAKIYYQTLSELIPEGF 182 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYD 218 D N ++ A LY E A + AG P +G +HT S V+D Sbjct: 183 KFE--KRSMHPAEDEFNAMLNYAFGILYSKVEKACIIAGLDPYVGIIHTDNYGKKSLVFD 240 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 + + + F + + + + + L+ + L Sbjct: 241 LIESYR-HLASRTVFSLFTQKRVQ-KYFFKREGNSVMLVGDGKKVLVESFYNRLEKK 295 >UniRef50_A1BI46 CRISPR-associated protein Cas1 n=2 Tax=Chlorobiaceae RepID=A1BI46_CHLPD Length = 347 Score = 171 bits (434), Expect = 2e-41, Method: Composition-based stats. 
Identities = 45/308 (14%), Positives = 91/308 (29%), Gaps = 50/308 (16%) Query: 17 MIFLQY--GQIDVIDGAFVLIDKTG----IRTHIPVGSVACIMLEPGTRVSHAAVRLAAQ 70 +++Q + V +G F + + V I+L ++ AA+R Sbjct: 10 TLYIQEQGSMLRVENGCFRVTCGHDDDVAELLEVQSIKVGQIVLFGACMITPAAIRHCLM 69 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR------- 123 +V + + G D Q + + +E L+ R + + Sbjct: 70 NRIPVVLLSQHGEYFTRLESTDDVNIDLERLQFQRSAEESFPLECSRTIVRAKLHNSGVL 129 Query: 124 ----------------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQY---GV 158 E S++ +RG EGS + + + G Sbjct: 130 LRRHAESSGSEALRHAATQLRQLEEHVDRADSIDAVRGYEGSGAATYFGVFEDFFDTGGF 189 Query: 159 TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKP--LSFV 216 + R P D +N +S S L+ + P +GF+H KP + V Sbjct: 190 IFRERVKRPP----TDPVNAMLSFGYSLLFNNIFSMARLHRLHPYVGFLHADKPAHPALV 245 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREV-----RLACRDIFRSSKTLAKLIPLIEDV 271 D+ + + V + + P+ + + S + E++ Sbjct: 246 SDLIEEFRT-LVDGLVIALINKRLISPEEFTVARHDDGKPKGCYLSDGARKTFLREFENL 304 Query: 272 LAAGEIQP 279 + P Sbjct: 305 MHRTTTHP 312 >UniRef50_A9AX66 CRISPR-associated protein Cas1 n=1 Tax=Herpetosiphon aurantiacus ATCC 23779 RepID=A9AX66_HERA2 Length = 350 Score = 171 bits (434), Expect = 2e-41, Method: Composition-based stats. 
Identities = 35/317 (11%), Positives = 83/317 (26%), Gaps = 55/317 (17%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGI-------RTHIPVGSVACIMLEPGTRVSHAAV 65 + ++L QY + A + +P+ + ++++ ++ +A+ Sbjct: 1 MQTLYLSEQYSIVKREGEALRVEIPEDQQLGRQRQVVRVPLNVIERVVVQGEITLTASAL 60 Query: 66 RLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF----- 120 + ++ +G A + L Q R + R Sbjct: 61 ACLLERRICTHFLSYSGRSQGALTPDPTRNASLRLAQYAAHTSIQHRFSLARTFVDGKLR 120 Query: 121 -----------------------ELRFGEPAPARRSVEQ-------------LRGIEGSR 144 LR S+ + + G EG Sbjct: 121 NLRTQILRFNRSQREPTLTQAIERLRDAHRDLHGLSIPEYVDPLDRMHGMGQILGCEGQG 180 Query: 145 VRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAI 204 A + W + + D +N +S L + + G+ P I Sbjct: 181 SAAYWDCWGMLLNQPWE--WHGRRRRPPPDPVNALLSYGYVILTSQVLSQLAIVGFDPYI 238 Query: 205 GFVHTGK--PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLA 262 GF+H + D+ + + V + + + + S + Sbjct: 239 GFLHQSSFGKPALALDLMEEFRPVIVDSVVLTVLNTKILN-QQHFQREPGSVQLSKEGRK 297 Query: 263 KLIPLIEDVLAAGEIQP 279 + +E+ ++ P Sbjct: 298 LFLTKLEERFSSEIQHP 314 >UniRef50_Q2J7N9 CRISPR-associated protein Cas1 n=1 Tax=Frankia sp. CcI3 RepID=Q2J7N9_FRASC Length = 344 Score = 171 bits (433), Expect = 3e-41, Method: Composition-based stats. 
Identities = 41/303 (13%), Positives = 81/303 (26%), Gaps = 47/303 (15%) Query: 11 LKDRVSMIF-LQYGQ-IDVIDGAFVLI--DKTGIRTHIPVGSVACIMLEPGTRVSHAAVR 66 + + ++ ++ G + + A + D R +P+ V I++ G ++ ++ Sbjct: 1 MAELLNTLYATTPGTSLHLDGDAVRIWHPDNDKGRRLLPLVRVDHIVVFGGVTITDDLLQ 60 Query: 67 LAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE 126 A + W+ G P G + Q D++ RL + + Sbjct: 61 RCATDRRSVTWLTGNGRFRARVEGPTGGNPHLRIAQHDHFRDDERRLTLAMSYIAGKLQN 120 Query: 127 PAP--------------------------------ARRSVEQLRGIEGSRVRATYALLAK 154 +V + G+EG R A Sbjct: 121 SRQLLLRAARDATGTRQTALRDTAAHLADALPTLRDTTNVAEAMGVEGQAARRYIATWPH 180 Query: 155 QYGVTWN-----GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 GR P D +N +S L A+ G P IG++H Sbjct: 181 LLTPHATVTAPAGRTSRP----ATDPVNAALSFGYGILRIAVHGALDHVGLDPHIGYLHG 236 Query: 210 GKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPL 267 +P + D+ + + V F + + + L Sbjct: 237 IRPGKPALALDLMEEFRALLVDRLVFTAFNQRQLTDADFEHHPGGSCQLTESGRKNYLTL 296 Query: 268 IED 270 Sbjct: 297 WSQ 299 >UniRef50_A9GDF7 Putative uncharacterized protein n=1 Tax=Sorangium cellulosum 'So ce 56' RepID=A9GDF7_SORC5 Length = 365 Score = 170 bits (432), Expect = 4e-41, Method: Composition-based stats. 
Identities = 54/304 (17%), Positives = 101/304 (33%), Gaps = 38/304 (12%) Query: 5 PLNPIPL-KDRVSMIFLQYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSH 62 P+ P +R S+ + +G +I AF + ++ G RT I V+ I++ +++ Sbjct: 19 PVRLYPPDTERRSLHVVSHGARIGRASDAFEVTEREGERTRIGAREVSDIVVHGHAQITT 78 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 A+RL A + +VG G V G + + + Q + D+ + + R++ Sbjct: 79 QALRLCAAEEIAVHFVGAGGAHVGVFS-GGSSGVQRRIRQFRGLTDDVFAIGLARRLVMA 137 Query: 123 RFGEPAPA-------------------------------RRSVEQLRGIEGSRVRATYAL 151 + L G EG+ R + Sbjct: 138 KIEMQLRHVLRASRKDEALRAGLDEPIESLRGGLRRAAKAADRTSLMGQEGNAARGYFGA 197 Query: 152 LAKQYGVTWNG--RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT 209 LA R D N +S + LY +AIL G P G +H Sbjct: 198 LAALVHPDAGDALRPRGRSRRPPEDRFNALLSFGYTLLYRDVLSAILRVGLEPGFGVLHQ 257 Query: 210 GKPLS--FVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPL 267 + + D+ ++ + V R + +R+ + R ++ S I + Sbjct: 258 PRSAAFPLALDLTELFRVPVVDMAVLGAVNRRTFDAERDFVITGRQVWLSDAGHRAFIEV 317 Query: 268 IEDV 271 E Sbjct: 318 YERR 321 >UniRef50_B3EG05 CRISPR-associated protein Cas1 n=11 Tax=Bacteria RepID=B3EG05_CHLL2 Length = 384 Score = 170 bits (432), Expect = 4e-41, Method: Composition-based stats. 
Identities = 37/334 (11%), Positives = 92/334 (27%), Gaps = 68/334 (20%) Query: 8 PIPLKDRVSMI-FLQYGQIDVIDGAFVLIDKTGIRTH-----------------IPVGSV 49 + ++ +S + ++ +I V + ++ + IP+ ++ Sbjct: 19 LLRKENTISFVPYVTQDEITVETNPSLYLEPDEEEAYSLNPVKDEHLNTAARRVIPINNI 78 Query: 50 ACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALD 108 + + + + L G + + ++ Q K Sbjct: 79 DSFFVFGEVSFNTKFLNFLTRNRIPLHLFNYYGFYSGSYYPREHLLSGYLVVNQVKHYSS 138 Query: 109 EDLRLKVVRKMFELR-------------------------------------FGEPAPAR 131 RL++ R+ A Sbjct: 139 TKKRLEIAREFIGAAAANIIRNLKYYTADSRQGVQDDESLAMLFHTIAQIESLANGIAAA 198 Query: 132 RSVEQLRGIEGSRVRATYALLAKQYG-----VTWNGRRYDPKDWEKGDTINQCISAATSC 186 + + L G+EG+ + Y + + +++ R P D + +N +S S Sbjct: 199 QDIPSLMGVEGNIRKVYYQVWQQLLRSADPAFSFSERVKRPPD----NAVNALVSFGNSL 254 Query: 187 LYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 +Y I P + F+H + S D+A++ K + F++ + Sbjct: 255 MYSACLTEIYRTQLNPTVSFLHEPSERRFSLALDMAEVFKPMFIDRLIFKLVNTRAIQA- 313 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 R A + ++ E+ + Sbjct: 314 RHFTTALNFCHLNDAGRKIVVKEFEERMRTTIKH 347 >UniRef50_Q0ADY5 CRISPR-associated protein, Cas1 family n=2 Tax=Nitrosomonas RepID=Q0ADY5_NITEC Length = 328 Score = 170 bits (430), Expect = 7e-41, Method: Composition-based stats. 
Identities = 42/291 (14%), Positives = 90/291 (30%), Gaps = 43/291 (14%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 ++ +F+ +++ GA V + +P+ + + L ++ A + + G Sbjct: 1 MTSLFVDRRGVVLELESGAIVFRENGERIGTVPIAPLTRVFLRGDVKLPAALLGKLGEQG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP--- 129 +V + R + + + Q +L+ D+ L++ + + E + Sbjct: 61 VGVVILSGRIGRPSLLLARPHNDAARRVVQIRLSFDKPFCLQIAKALIERKLTRQIEWFA 120 Query: 130 ----------------------------ARRSVEQLRGIEGSRVRATYALLAKQYG--VT 159 S LRG+EGS ++ L + Sbjct: 121 ELRENDMQVRYELSHALRALEEHRSRIGHVSSAASLRGVEGSAAARYFSGLQAVVPDSLH 180 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVY 217 ++GR P D N +S + L+ A+ G+ P +GF H S Sbjct: 181 FSGRNRRPPR----DPFNALLSLTYTLLHSEIAIALYGTGFDPYVGFYHRLAFGRESLAS 236 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 D+ + ++ A + R+ E D + Sbjct: 237 DLLEPLRP-LADQFALALIRKKVLEKD-HFSTTEAGCLLGKAGRTRYYAAY 285 >UniRef50_Q2FL78 CRISPR-associated protein, Cas1 family n=1 Tax=Methanospirillum hungatei JF-1 RepID=Q2FL78_METHJ Length = 331 Score = 169 bits (428), Expect = 1e-40, Method: Composition-based stats. 
Identities = 51/308 (16%), Positives = 98/308 (31%), Gaps = 40/308 (12%) Query: 15 VSMIFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSV-----ACIMLEPGTRVSHAAVRL 67 ++ + + +I + T P ++ + + +S AAVRL Sbjct: 1 MNSVLITGAGYRIRKRGDVLTIETGKDSDTAEPPRTLSPLGLDLLAIAGDHSISTAAVRL 60 Query: 68 AAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM-------- 119 G + + G + P G + Y+A+ + E+ RL++ R + Sbjct: 61 VTSHGGAIALMDGLGNP-FGHFLPLGRSALIEQYEAQASAPEERRLEIARSICTGALENK 119 Query: 120 ----------------FELRF----GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVT 159 E+R + A +S++ LRG+EGS A + + + Sbjct: 120 RTLLSNLERIRGFDLSREIRLVEDAQDKALECQSLDSLRGVEGSGAHAYFQGFSLAFDEE 179 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVY 217 W D +N +S LY A++ +GY+P G H K + VY Sbjct: 180 WGFLG--RSQNPATDPVNSLLSYGYGMLYIQARQALVLSGYSPYYGAYHETYKKQEALVY 237 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 D+ + + V ++ PD + K + + Sbjct: 238 DLVEEFRQPVVDRTVVTFLAKHMATPDDFTYPDEGGCMIGTMAKKKYAAAVLTRIHGKVK 297 Query: 278 QPPAPPED 285 +D Sbjct: 298 YEEQTFQD 305 >UniRef50_B8CYA1 CRISPR-associated protein Cas1 n=2 Tax=cellular organisms RepID=B8CYA1_HALOH Length = 325 Score = 169 bits (428), Expect = 1e-40, Method: Composition-based stats. 
Identities = 53/294 (18%), Positives = 97/294 (32%), Gaps = 48/294 (16%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 + V +F + + + + V+ I++ G +S AV+LA + + ++ E G Sbjct: 10 SYLHVKQKSFEIKTEEDKK-RVSAKKVSSILITTGAAISTDAVKLALENNIEIQFLDEFG 68 Query: 83 VRVYASGQPGGARSDKLLY-QAKLALDED-------------------------LRLKVV 116 + P + + Q +LA E+ R K Sbjct: 69 CSLGKVWHPKLGSTTYIRRKQLELAESEEGTELVKEFMLDKIDNMINHLHDLAIKRSKSK 128 Query: 117 RKMFELRFGEPAPARRSVE-----------QLRGIEGSRVRATYALLAKQYG--VTWNGR 163 K + E R +E + G EG+ R +A L+ +NGR Sbjct: 129 EKYINKKIKEICELRNKLEKVTGYIEDVRNTIMGYEGNISRKYFASLSFLLPDRYKFNGR 188 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIAD 221 + P D N ++ LYG E A++ AG P +G +HT SFV+D + Sbjct: 189 SFRP----AEDEFNCLLNYGYGVLYGKVEKALIIAGLDPYVGILHTDGYNKKSFVFDFIE 244 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 + ++ R + + + L+ + D Sbjct: 245 PYRHHI-DRVVMKLFSRKKIR-KLHFDKIQGGLTLNDEGKKLLLTELNDYFDKK 296 >UniRef50_A0LHZ4 CRISPR-associated protein, Cas1 family n=2 Tax=Deltaproteobacteria RepID=A0LHZ4_SYNFM Length = 350 Score = 168 bits (427), Expect = 1e-40, Method: Composition-based stats. 
Identities = 48/315 (15%), Positives = 90/315 (28%), Gaps = 54/315 (17%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 + I Q + V+ I IP+ ++ + L + +S A + + + Sbjct: 4 TYILEQGAYLRKAGNHLVVTKNREIIAEIPLEGLSQLTLVGFSSLSGAVLEVLIRHRIET 63 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE--------- 126 V + G + Q D L+ + + + Sbjct: 64 VLLSPRGQFRARLMVDEHKHVQRRQGQYVKLSGADFALRTTQSIVRGKLRNTARFLALRG 123 Query: 127 --------------------PAPARRSVEQLRGIEGSRVRATYALL---AKQYGVTWNGR 163 ++ ++ LRGIEG + + + G +NGR Sbjct: 124 SRYGSEALHRAAAQIKGLSALVDRQKDMDLLRGIEGHAANLYFEVFPLLVRVPGFEFNGR 183 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVYDIA 220 P D +N +S + L AI G P +G +H G+P S D+ Sbjct: 184 NRRPP----LDPLNALLSFVYTLLTQEVLTAIKVVGLDPYLGCLHAVDYGRP-SLACDLV 238 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRL--ACRDIFRSSKTLAK------------LIP 266 + + + R D V C D + + I Sbjct: 239 EEWRTFLGDRLVLALVNRRVIGLDDFVYRPTPCADAVDEEELKHRRPVEMKPKIARAFIE 298 Query: 267 LIEDVLAAGEIQPPA 281 E +A+ + P + Sbjct: 299 AYEKWMASRILDPGS 313 >UniRef50_B0TFX3 Crispr-associated protein cas1 n=4 Tax=Clostridia RepID=B0TFX3_HELMI Length = 364 Score = 168 bits (427), Expect = 1e-40, Method: Composition-based stats. 
Identities = 49/300 (16%), Positives = 94/300 (31%), Gaps = 57/300 (19%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACI-MLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 + G V+ +K + T +P V + ++ G +S A+ + G + + Sbjct: 39 ASLGKKSGRLVVREKGQVVTEVPFDRVEQVTVITSGASLSTDAIEECVRHGIEINLLDFR 98 Query: 82 GVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA----------- 130 G PG + K + LA ++ + + + + Sbjct: 99 GSPYAKLFAPGLTATVKTRREQLLAFNDQRSIFLAKAFVRGKIQNQINTLKYFAKYRKSA 158 Query: 131 ---------------RRSVEQLRGI---------------EGSRVRATYALLAKQYG--V 158 +++++L GI EG + +A V Sbjct: 159 RQEVYAYLQDAALLMEKNLQELTGIDGLNIDAVRGPLMSVEGRAATRYWDAVAFLLKGYV 218 Query: 159 TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFV 216 + GR + D +N ++ + L A+L AG P GF+H +P S V Sbjct: 219 VFPGRE----NRGATDPVNSLLNYGYAVLEARVLGAVLQAGLDPYAGFLHVDRPGKTSLV 274 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGE 276 YD + + + + V + + T L+ I+ LAA E Sbjct: 275 YDFIEEFRQPVIDRPVLAMVIAG-------VEIGMEGERLADGTKRDLLERIQQRLAATE 327 >UniRef50_C8WTR3 CRISPR-associated protein Cas1 n=2 Tax=Alicyclobacillus acidocaldarius RepID=C8WTR3_ALIAD Length = 346 Score = 168 bits (426), Expect = 2e-40, Method: Composition-based stats. 
Identities = 47/306 (15%), Positives = 94/306 (30%), Gaps = 47/306 (15%) Query: 17 MIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 +F+Q + V V+ + +P+ V I+ + + A G Sbjct: 8 TLFVQREGAIVRVHQDTVVVTLENETLLRVPMHMVDSIVGIGRVSFTSPLLERCAAEGRS 67 Query: 75 LVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP------- 127 +V + G +Y Q + A + L ++R + + Sbjct: 68 VVRMTRGGRFLYRIEGRMSGNVLLRTAQHEAARSPERSLTIMRAIVAGKVHNQRQLVLKA 127 Query: 128 --------------------------APARRSVEQLRGIEGSRVRATYALLAKQYG---- 157 P+ +++RG+EG+ R + L Sbjct: 128 ARDLTAPADRSFVREVAGDLGRELRKLPSASHPDEIRGVEGASARRYFMALRHLIAPAIR 187 Query: 158 --VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL-- 213 ++++GR P D +N +S + + E+A+L G P IGF+HT +P Sbjct: 188 DALSFDGRNRRPPR----DPVNAVLSFLYALITRDAESALLGVGLDPQIGFLHTLRPGRP 243 Query: 214 SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 S D+ + ++ + R +P L + + L + Sbjct: 244 SLALDLVEEMRPILADRVMLSLFNRRQLQPSDFEVLPGGAVELTDSGRRTLFAEWDRRKQ 303 Query: 274 AGEIQP 279 P Sbjct: 304 VEIEHP 309 >UniRef50_A7GY67 Crispr-associated protein Cas1 n=6 Tax=Campylobacter RepID=A7GY67_CAMC5 Length = 332 Score = 166 bits (421), Expect = 7e-40, Method: Composition-based stats. 
Identities = 51/295 (17%), Positives = 88/295 (29%), Gaps = 34/295 (11%) Query: 12 KDRVSMIFLQYGQIDVIDGAFVLIDKTGI-----RTHIPVGSVACIMLEPGTRVSHAAVR 66 DR I L G++ D +P+ ++ I + + + Sbjct: 4 SDRTHFI-LSSGRLRRQDNNIYFDKFDETGGVTASKILPINAIDEIYILTRVELDTYTLA 62 Query: 67 LAAQVGTLLV----WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 A LL + G ++ LL Q + D R+ + R++ Sbjct: 63 FLADNNILLHVFSPFQSFRGNFYPSTSNSVNKSGFALLSQLRAFDDPVKRVYIAREITRA 122 Query: 123 RFGEP-------------------APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGR 163 A V Q+ EG+ + Y + + + Sbjct: 123 HMLNDAANCKKHGVKFDPAPHIAALDAAADVGQIMAAEGAFQKLYYEKWNEIIADQRSFK 182 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIAD 221 D IN IS + +Y V + I P IGF+H + LS D+A+ Sbjct: 183 FTVRSKRPPADKINSFISYVNTRIYNVCLSEIYKTELDPRIGFLHEPNYRALSLHLDLAE 242 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSS---KTLAKLIPLIEDVLA 273 I K F + + A R F + K K+I + + +A Sbjct: 243 IFKPILGDTLIFAMLNKKEITAKDFQTDAGRIKFSNDAIQKIEMKMISRLSETIA 297 >UniRef50_Q96X75 Putative uncharacterized protein ST2634 n=1 Tax=Sulfolobus tokodaii RepID=Q96X75_SULTO Length = 317 Score = 165 bits (419), Expect = 1e-39, Method: Composition-based stats. 
Identities = 46/276 (16%), Positives = 92/276 (33%), Gaps = 34/276 (12%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 ++ FV+ K G + ++ V I++ G V+ A+RLA G ++++ G Sbjct: 25 KLSTKGKTFVISKKDGKKVNVSPAEVDQIVIMTSGVTVTSKAIRLALDHGIDIIFLDSRG 84 Query: 83 VRVYASGQPGG-ARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIE 141 + Q L + ++ R++ + + A + + GIE Sbjct: 85 NPFGRLFHSEPIKTVETRKAQYLAILKGEE--EIPREIIKSKIKNQANHIKFWFKKLGIE 142 Query: 142 GS--------------RVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCL 187 G+ R + L + + GR D E D N + A + L Sbjct: 143 GNDYKLIEGKDDDEATAARYYWHALGRI--IPMKGR-----DPESTDPFNVSFNYAYAIL 195 Query: 188 YGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDR 245 Y + + G P GF+H + S VYD +++ K V + Sbjct: 196 YSNIQRVLQLVGLDPYAGFIHKDRSGKPSLVYDFSEMFKPVLVDYPLVSLFINGFIPN-- 253 Query: 246 EVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPA 281 +D +++ K+ + + + Sbjct: 254 -----VKDGILDAESRKKIAEAVINAMNGKVKDEGG 284 >UniRef50_A8UXX8 Putative uncharacterized protein n=1 Tax=Hydrogenivirga sp. 128-5-R1-1 RepID=A8UXX8_9AQUI Length = 290 Score = 165 bits (418), Expect = 2e-39, Method: Composition-based stats. 
Identities = 45/277 (16%), Positives = 100/277 (36%), Gaps = 30/277 (10%) Query: 13 DRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 R+ +++ ++ + + I + +PV V I+ G +S A+ L Q Sbjct: 2 KRIVVVY-SSARVSRSGERVKISTFS-INSSLPVRYVEAIVAFGGLELSSHALSLLMQNN 59 Query: 73 TLLVWVGEAGVRVYASG-QPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP---- 127 + ++ + G + + + + Q + + + V +++ + Sbjct: 60 VPVFFLTKLGALKAVLWTKILSSNTSNRIRQYEKYVRDPF--GVAKEIVRAKIRTIEREF 117 Query: 128 ----------APARRSVEQLRGIEGSRVRATYALLAKQY---GVTWNGRRYDPKDWEKGD 174 + E+L GIEG+ R + ++ G ++ R Y P D Sbjct: 118 GLKLNNLISSLERAGTKEELLGIEGTASRLMFERFSQNIELSGFSFRERAYHPPP----D 173 Query: 175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPKA 232 +N +S + + Y ++ GY P I F+HT G L+ DI + ++ Sbjct: 174 PVNALLSLSYTFTYALSLPLTTLMGYDPYISFLHTRSGSHLALCSDIMEPVRPVLTKRLE 233 Query: 233 FEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIE 269 I RR + ++ + +++ K + E Sbjct: 234 EPILRRVFTK--KDFNRERAACYLKKESMPKFLNWFE 268 >UniRef50_D0W646 CRISPR-associated protein Cas1 n=1 Tax=Neisseria cinerea ATCC 14685 RepID=D0W646_NEICI Length = 325 Score = 165 bits (418), Expect = 2e-39, Method: Composition-based stats. 
Identities = 37/293 (12%), Positives = 83/293 (28%), Gaps = 43/293 (14%) Query: 15 VSMIFLQYGQIDV--IDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 ++ +++ + + + V + IP+ + C+ + +S + + Sbjct: 1 MTTLYIDRKNLTLRADGDSLVCYENGERTATIPLKVLQCVCIRGDLTLSAKVLGKLGEAD 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL---------- 122 ++ + R + + Q + D+ L R Sbjct: 61 IGVLVLNGRLKRPALMLPNLKLDGSRRVAQYAFSQDKAACLAAARNTVSAKLSAQQQHLQ 120 Query: 123 ---------------------RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG--VT 159 R + P + +LRGIEG+ + A + Sbjct: 121 QMMPSDGTAADCLNKHIKAISRLADTVPDCNGIARLRGIEGAAAAQYFGAWAAVLPETLH 180 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVY 217 + GR P D +N +S + ++ + +G P IG+ H + S Sbjct: 181 FTGRNRRPPR----DPVNAALSLTYTLMHFEIVKHLHLSGLDPFIGYYHLPEHGRESLAC 236 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 D+ + ++ + R P + + K P E Sbjct: 237 DLTEHLRCR-CDEWVLSLFARQILTP-SDFSTNPQGCLMGKNARIKFYPAYEQ 287 >UniRef50_A4FXZ8 CRISPR-associated protein, Cas1 family n=9 Tax=cellular organisms RepID=A4FXZ8_METM5 Length = 342 Score = 165 bits (418), Expect = 2e-39, Method: Composition-based stats. 
Identities = 50/296 (16%), Positives = 94/296 (31%), Gaps = 50/296 (16%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGV 83 I F L IP V I++ + +S A+ LA + LV + G Sbjct: 27 YISKSGTRFKL-KNGENVQEIPAKKVEQILITCPSSISTEAISLAVEENIDLVLLKMNGK 85 Query: 84 RVYASGQPGGARSDKLLYQAKLALDEDLRL-----KVVRKMFE----------------- 121 + + + L + +L + + +KM Sbjct: 86 PIGRFWHSKHGSISTIRKKQLLLSENELGITFVKEWISKKMENQIDFLKMLSMNRRDERR 145 Query: 122 -------LRFGEPAPA------RRSVEQLR----GIEGSRVRATYALLAKQYG--VTWNG 162 L+ E ++++++R G EG R + +++K +NG Sbjct: 146 ELLKENVLKIDEEIKKLDNVTFNQNIDEIRNTVQGYEGYASRVYFEMISKSLPEKYQFNG 205 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIA 220 R +P D N ++ LY E + + AG P IG +H +F +D+ Sbjct: 206 RSRNPAK----DYFNCMLNYGYGILYSQIERSCIIAGLDPYIGILHVDNYNRKAFTFDLI 261 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGE 276 + + V F++ D F + LI ++ + E Sbjct: 262 EKYR-IYVDKTIFKMFSTKKIR-DDFFEEIEGGFFLAKDGKQALISEYNLLMESTE 315 >UniRef50_B5YJS2 Crispr-associated protein Cas1 n=1 Tax=Thermodesulfovibrio yellowstonii DSM 11347 RepID=B5YJS2_THEYD Length = 318 Score = 164 bits (416), Expect = 3e-39, Method: Composition-based stats. 
Identities = 42/294 (14%), Positives = 92/294 (31%), Gaps = 43/294 (14%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S +F+ +I V + K +P+ + +++ ++ + + G Sbjct: 1 MSTVFIDRKDIEIRVDGNSISFYAKGKKDGSLPLSPLKRVVIVGNVKIETSVLYKLVNHG 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF-------- 124 ++++ P + Q + +L LK +++ + + Sbjct: 61 ITVLFLTGKLKYSGILNGPLHNNGLLRVKQYQKSLS-GFSLKFAKELIKRKIVSQRDFLS 119 Query: 125 ------------------------GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTW 160 S++ LRGIEG+ + +K + + Sbjct: 120 EIREIKKALAMQADRAIEILNKAISNIEVTPISIDSLRGIEGAASSIYFITYSKIFPNSL 179 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT---GKPLSFVY 217 D +N +S + L+ I G P IGF H G+ S Sbjct: 180 KF--VRRIKRPPKDPVNAMLSLCYTLLHYEIVREIQLIGLDPTIGFYHQFEYGRE-SLAC 236 Query: 218 DIADIIKFDTVVPKAFEIA-RRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 D+ ++ + + V +E+ ++ G D ++ K PL E Sbjct: 237 DLVELFRVN-VDRFVYELFKAKHLGNRDFMKDEESGGVYLKKTGRKKFYPLYEQ 289 >UniRef50_B0VHC1 Putative uncharacterized protein n=1 Tax=Candidatus Cloacamonas acidaminovorans RepID=B0VHC1_9BACT Length = 344 Score = 164 bits (415), Expect = 4e-39, Method: Composition-based stats. 
Identities = 33/310 (10%), Positives = 79/310 (25%), Gaps = 54/310 (17%) Query: 15 VSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + ++++ Q I + + + ++L ++ V + Sbjct: 1 MPVVYIADQGSHICKKGDRLYVYRGNQLLRWFHTKDIIQLILVGNIGLTSQVVTYLLKNR 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARR 132 V++ G G + Q + + + R+ + + + Sbjct: 61 IDTVFLSYYGKFKGRLVGEFGKNVMLRVNQFQYLNEPENRVYLANLIIRGKIENSLLHLA 120 Query: 133 SV------------------------------EQLRGIEGSRVRATYALLAKQY---GVT 159 + L G EG + +A Sbjct: 121 KRKKRNKAESLTFAYIKNKALLAQLKTQILPKDILLGYEGIAAKNYFAAFPDLIANPDFP 180 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVY 217 ++GR P D +N +S + + L A G P G +H S V Sbjct: 181 FSGRNKRPPK----DEVNAMLSLSYTFLMNQVMCAAYICGLDPYYGALHDLDYGRQSLVL 236 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRD------------IFRSSKTLAKLI 265 DI + + + + R + + D + + + +I Sbjct: 237 DIMEEFRP-LIDNMVISLINRKEIRLEHFLYNTLPDDENSDLDNSTLPVSLTKNGMKIII 295 Query: 266 PLIEDVLAAG 275 ++ + Sbjct: 296 TAFSKLINSK 305 >UniRef50_A7NP58 CRISPR-associated protein Cas1 n=6 Tax=Chloroflexi (class) RepID=A7NP58_ROSCS Length = 338 Score = 163 bits (412), Expect = 9e-39, Method: Composition-based stats. 
Identities = 45/302 (14%), Positives = 87/302 (28%), Gaps = 57/302 (18%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRLAAQVGTLLVWVGE-AG 82 I G ++ + + +P+ + I++ G +S VR A+ G + ++ G Sbjct: 12 ISKHQGRIRVMKEKERLSEVPIMHLEQILICSDGVGLSSDVVRACAEEGIPIHFLNSANG 71 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM-------------FELRFGEPAP 129 G A D++ L++ + + + + Sbjct: 72 GDYGTFVHSGITGMALTRRAQLRAGDDERGLRLAQAFASGKIQSQANMLRYAAKNRKEND 131 Query: 130 ARR----------------------------SVEQLRGIEGSRVRATYALLAKQYGVT-- 159 + L G EG + +A+ Sbjct: 132 PDLHNDLMRTATEILDALPPLRAVRGVLTDETRAALMGFEGMAGARYWTAVARIIPDDLG 191 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVY 217 W GR D NQ ++ L A++ AG P GF+H +P S Sbjct: 192 WPGRETR----GARDRFNQALNYGYGVLQSQVRTALILAGLDPNAGFLHADRPGKPSLTL 247 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 D+ + + V + R R+ D T ++ I + L + E+ Sbjct: 248 DLIEEFRQAVVDRTLIGLVNRQFEIVQRD------DGLLDEDTRKRIAEKILERLNSTEL 301 Query: 278 QP 279 Sbjct: 302 YE 303 >UniRef50_A7BYC5 Protein containing DUF48 n=1 Tax=Beggiatoa sp. PS RepID=A7BYC5_9GAMM Length = 322 Score = 163 bits (412), Expect = 9e-39, Method: Composition-based stats. 
Identities = 45/308 (14%), Positives = 89/308 (28%), Gaps = 39/308 (12%) Query: 15 VSMIFLQY-GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT 73 + +F+ + + + G P+ + ++L + ++ + L Q G Sbjct: 1 METLFISRDATLKRRENTLAVTV-GGKTKPFPIEKIRHLVLLGESSLNTKLLTLCGQNGV 59 Query: 74 LLVWVGEAGVRVYASGQPG-GARSDKLLYQAKLALDEDLRLKVVR--------------K 118 L G A L QAKL LD++ R+ + R Sbjct: 60 RLSIFDYYGYFKGAFEPIEQNGSGRVKLAQAKLILDQEQRMAIAREIVRGAAHNMRANLA 119 Query: 119 MFELR--------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQY-GVTWNGR 163 ++ R + + ++L G EG + +A A + + R Sbjct: 120 YYQYRGNKALSKSTQEITKLMDRLHFAKDSDELMGFEGQITQTYFAAWALIDQRLDFLPR 179 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIAD 221 P + + IN IS Y VT + F+HT S D+++ Sbjct: 180 VRRPPN----NPINCLISFINQLTYTVTRHEAFKTHLEETLSFLHTPSTGRSSLSLDLSE 235 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPA 281 K ++ R+N + D + ++ L + Sbjct: 236 PFKPVLSHGLIIKMVRKNMVD-DAWFDQKPGVCLLTETGRRNVVEQFSIRLEERYEERSF 294 Query: 282 PPEDAQPV 289 + Sbjct: 295 REWLYREA 302 >UniRef50_A6UNF5 CRISPR-associated protein Cas1 n=1 Tax=Methanococcus vannielii SB RepID=A6UNF5_METVS Length = 322 Score = 163 bits (412), Expect = 9e-39, Method: Composition-based stats. 
Identities = 42/291 (14%), Positives = 96/291 (32%), Gaps = 48/291 (16%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 +I V+ K G+ + ++ +++ + ++ + L+ + +V + + G Sbjct: 10 SKISKSGNRIVIESKDGVFES-SIENIEQVLICAPSSITTEFLELSVKNNVDVVLLNKYG 68 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSV-------- 134 V G ++ Q KL + + ++++ + ++ Sbjct: 69 KPVGRFFGNGIEKTALKRKQLKL-TENWAGISIIKEFLSEKIENQIEHLNNISKENISEG 127 Query: 135 ----------------------------EQLRGIEGSRVRATYALLAKQYG--VTWNGRR 164 +Q++G+EGS + + +L+K +NGR Sbjct: 128 SLSKSISKINKSKNMLLEIKGENIAEIRDQIQGLEGSVSKIYFRVLSKSLPKKYQFNGRS 187 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADI 222 P D N ++ LY E + +G P IG +HT SFV+D + Sbjct: 188 RKPAK----DYFNCMLNYGYGMLYSEIEKICIISGIDPTIGILHTDGQNRKSFVFDYIEK 243 Query: 223 IKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 + + + +E + + + R + +I L Sbjct: 244 YR-NIIDKIVYESFLNETIK-ENFFKEVERGYLLDEEGRKHIISKFNGYLN 292 >UniRef50_B0VIK5 Putative uncharacterized protein n=1 Tax=Candidatus Cloacamonas acidaminovorans RepID=B0VIK5_9BACT Length = 353 Score = 162 bits (411), Expect = 1e-38, Method: Composition-based stats. 
Identities = 49/292 (16%), Positives = 97/292 (33%), Gaps = 47/292 (16%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 + D F L + +T + ++ I++ ++ A++LA +V++ + G Sbjct: 11 SYLRKKDEMFELSIED-RKTKLSPEKISSIVISNAATITTDAIQLAMDYNIDIVFLDKYG 69 Query: 83 VRVYASGQPGGARSDKLLY-------------------------QAKLALDEDLRLKVVR 117 P + + Q + + + Sbjct: 70 SPYGRIWFPKIGSTVLIRRRQLEMLSDNVGLQFIKDWIAIKIMNQYRFVQRLISKRDCDK 129 Query: 118 KMFELRFGEPAPARRSV-------EQL----RGIEGSRVRATYALLAKQYG--VTWNGRR 164 +F+ R A + E+L G EG + +++LA+ + GR Sbjct: 130 SIFQTRMQNMQEAAICIMQAEGKLEELSGSFMGWEGGASKNYFSILAELIPDAYKFEGRS 189 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADI 222 P D N ++ LY E A++ AG P +G +H+ SFV+D + Sbjct: 190 SRPAK----DAFNAMLNYGYGILYSKVERALIIAGLDPYLGLLHSDNYNKKSFVFDFIEP 245 Query: 223 IKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 + V F I R+ P + + + S+ L PL+ + Sbjct: 246 YR-ILVDEPVFYIFSRHKFSP-YFIEPVHQGVLLSTTGKKFLAPLLLEHFDE 295 >UniRef50_B6IX22 CRISPR-associated protein Cas1, putative n=2 Tax=Rhodospirillum RepID=B6IX22_RHOCS Length = 350 Score = 162 bits (409), Expect = 2e-38, Method: Composition-based stats. 
Identities = 52/319 (16%), Positives = 95/319 (29%), Gaps = 55/319 (17%) Query: 15 VSMIFLQ--YGQIDVIDGAFVLIDK--------------TGIRTHIPVGSVACIMLEPGT 58 ++ +++ + G+ + + T I + ++L T Sbjct: 1 MTTLYVTVPGAIVRTESGSLSVWIEVQADDGGPDDGPIRRKRLTSIEPHRLETLVLLGQT 60 Query: 59 RVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRK 118 ++ A+RL + + G P +D L+Q L + RL R Sbjct: 61 ALTPNAMRLCMAHRITVALLDGGGNLAARVVPPEARTADLRLHQYTLHHTPEERLIRARG 120 Query: 119 MFELRFGEPAP-----------------------------ARRSVEQLRGIEGSRVRATY 149 + + A + E L GIEG+ R + Sbjct: 121 VVAAKLHNAAEVLRAVRSNQSNPDIARAIAEVERTAATVADAVTPETLLGIEGNGARQYF 180 Query: 150 ALLAKQY--GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFV 207 A L + + + GR P D N +S L + A G P +GF Sbjct: 181 AGLRAAFVGDIRFTGRAQRPPP----DPANSMLSFGYVLLGNRIAGLLEARGIDPCLGFF 236 Query: 208 HTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACR--DIFRSSKTLAK 263 H +P S D+ + ++ V + +PD A R + + + Sbjct: 237 HALRPGRPSLALDLLEELRQPVVDRLVLRLCNLRMLKPDMFEADAERPGGVRLTVEGRRT 296 Query: 264 LIPLIEDVLAAGEIQPPAP 282 + E LA + P Sbjct: 297 FLEEWEAQLARPLREQGTP 315 >UniRef50_C0QHV1 Putative CRISPR-associated protein (Uncharacterized protein, predicted to be involved in DNA repair) n=1 Tax=Desulfobacterium autotrophicum HRM2 RepID=C0QHV1_DESAH Length = 338 Score = 160 bits (405), Expect = 6e-38, Method: Composition-based stats. 
Identities = 44/300 (14%), Positives = 92/300 (30%), Gaps = 48/300 (16%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 I + + F L ++ + V ++ ++ A+ LA + +V Sbjct: 4 YIHSPGTYLTQKNEIFRLKNQD-RSLDLSPRKVESFVITNQAMITTQAINLALENNIDMV 62 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSV-- 134 ++ G + + + D L +V+ + + + G +++ Sbjct: 63 FLDAFGDPTGRIWFAKMGSTALIRRKQLEMDQNDSGLMIVKDLIKNKIGNQVKFLKTLKN 122 Query: 135 -----------------------------------EQLRGIEGSRVRATYALLAKQYG-- 157 + G+EG+ RA + +LA+ Sbjct: 123 ARPGKEHRFIDTILAIEKILQTLEVTDGENVIDLRNTIMGLEGTSARAYFKVLARAMPEK 182 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SF 215 + GR P D N ++ LYG E A + AG P IGF+HT S Sbjct: 183 YRFKGRSRRPAK----DPFNAVLNYCYGMLYGKVEKACIIAGLDPFIGFLHTDNYNKKSL 238 Query: 216 VYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 V+D+ + + A + D + + + ++ ++ L Sbjct: 239 VFDLIEPFR-IFAETTAVYLF-TGRKMKDDYFDMYEHSVSLNKNGKPVVVEAMDKHLEEK 296 >UniRef50_A5UJ50 Uncharacterized protein predicted to be involved in DNA repair n=5 Tax=Methanobrevibacter RepID=A5UJ50_METS3 Length = 334 Score = 159 bits (403), Expect = 9e-38, Method: Composition-based stats. 
Identities = 35/292 (11%), Positives = 86/292 (29%), Gaps = 52/292 (17%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGV 83 + D V+++K I V+ I + ++ A+ L A+ L+ + G Sbjct: 11 TLRKKDNQIVVMEKDNEIYRISANKVSDITIMAKGHITFDALNLMAKNNIKLISINYFGQ 70 Query: 84 RVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQ------- 136 Y P Q + + + L + ++ + +++ + Sbjct: 71 INYILESPNQNNIVLKKLQYQASENH-KGLLISKEFITSKMKNQKSTIKTLNKNKKIKEV 129 Query: 137 -------------------------------LRGIEGSRVRATYALLAKQYG--VTWNGR 163 + G EG + ++ + + R Sbjct: 130 KIIENKIKQNIKDFKNFKITLNDNIFTSKNKIMGFEGIASVNYWEAVSLLLPDEINFKKR 189 Query: 164 RYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIAD 221 P++ D +N ++ + L I+ G P G +H + S ++D+ + Sbjct: 190 NQKPEN----DVVNSMLNYGYAILASEIAKNIVTLGLDPYCGLLHADLKRRQSLIFDLIE 245 Query: 222 IIKFDTVVPKAFEIARRNPGEPDREVRL-----ACRDIFRSSKTLAKLIPLI 268 + V F++ N + + + +AK+ + Sbjct: 246 EFRQQIVDKTVFKLINTNQINETNIDKRNNSLKLESRKLLAGEIMAKIHSDL 297 >UniRef50_C5EH11 Crispr-protein cas1 n=1 Tax=Clostridiales bacterium 1_7_47FAA RepID=C5EH11_9FIRM Length = 342 Score = 159 bits (403), Expect = 1e-37, Method: Composition-based stats. 
Identities = 40/309 (12%), Positives = 91/309 (29%), Gaps = 45/309 (14%) Query: 11 LKDRVSMIFL--QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLA 68 ++ ++ I++ + + + V + + IP +V I+ S A + Sbjct: 3 MRKLLNTIYVTNELAYLWLDGENLVCKIENENKLRIPFDNVENIVCFNYIGCSPALMGKC 62 Query: 69 AQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDE--------------DLRLK 114 + +V G + + Q ++ + R Sbjct: 63 VGKSIPINFVSPQGKFLAKVCGETKGNVFLRVAQIDCFREQGLVLAQNTMAAKFCNTRQV 122 Query: 115 VVRKMF---ELRFGEPAPARRSV--------------EQLRGIEGSRVRATYALLAKQY- 156 + R + LR S+ E + GIEG+ ++ +++ K Sbjct: 123 IKRTLHDNSNLRGDAEIQRTLSILEEGVDKLFQAENMESVIGIEGNCAQSYFSIFGKLIT 182 Query: 157 ----GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKP 212 T+ R P D +N +S + AA+ G IGF H + Sbjct: 183 NQKAPFTFELRTKHPP----LDPVNAILSFVYTLFTNEFAAALETVGLDSYIGFCHALRS 238 Query: 213 L--SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 S D+ + + V + + + ++ + K++ ++ Sbjct: 239 GRSSLACDLVEEARC-IVERFVLTLLNLQIVGEEDFEKQISGAVWLNEDGRKKVLARWQE 297 Query: 271 VLAAGEIQP 279 + P Sbjct: 298 KKRTDMMHP 306 >UniRef50_Q03KT5 CRISPR-associated protein, Cas1 family n=5 Tax=Streptococcus RepID=Q03KT5_STRTD Length = 334 Score = 159 bits (403), Expect = 1e-37, Method: Composition-based stats. 
Identities = 39/294 (13%), Positives = 87/294 (29%), Gaps = 39/294 (13%) Query: 15 VSMIFLQYG--QIDVIDGAFVLI-DKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 +S ++ Q + + + ++ D I + + V ++L +++ ++ ++ Sbjct: 1 MSDLYSQRSNYYLSLSEQRIIIKNDNKEIVKEVSISLVDNVLLFGNAQLTTQLIKALSKN 60 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP---- 127 + + G + + K QAK +ED RL+V R + + Sbjct: 61 KVNVYYFSNVGQFISSIETHRQDEFQKQELQAKAYFEEDFRLEVARSIATTKVRHQIALL 120 Query: 128 -----------------------APARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNG 162 S+ ++ G EG ++ + L +NG Sbjct: 121 REFDTDGLLDTSDYSRFEDSVNDIQKAYSITEIMGYEGRLAKSYFYYLNLLVPDDFHFNG 180 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIA 220 R D N ++ S LY I G + G +H + D+ Sbjct: 181 RSRR----TAEDCFNSALNFGYSILYSCLMGLIKKNGLSLGFGVIHKHHQHHATLASDLM 236 Query: 221 DIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 + + V E+ R +D + + + + Sbjct: 237 EEWRPIIVDNTLMELIRNGKLLL-SHFENKDQDFILTDEGREIFAWALRSRILE 289 >UniRef50_P71636 CRISPR-associated protein Cas1 n=11 Tax=Mycobacterium tuberculosis complex RepID=P71636_MYCTU Length = 338 Score = 159 bits (402), Expect = 1e-37, Method: Composition-based stats. 
Identities = 44/325 (13%), Positives = 96/325 (29%), Gaps = 42/325 (12%) Query: 18 IFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +++ +I DG ++ + + P+ ++ I L ++ + + + Sbjct: 4 LYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRERDI 63 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP-------- 127 G P + + +L Q D L + +++ + Sbjct: 64 QLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIRAHT 123 Query: 128 ------------------APARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNGRRYDP 167 S+ +L G EG+ +A + L + GR P Sbjct: 124 SGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRSTRP 183 Query: 168 KDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKF 225 D N +S S LY AI IGF+H + D+ ++ + Sbjct: 184 P----LDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRA 239 Query: 226 DTVVPKAFEIARRNPGEPDREVR-LACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPP- 283 + + + + +F + + + + +A P Sbjct: 240 PIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPH 299 Query: 284 ----EDAQPVAIPLPVSLGDAGHRS 304 + A + + V + +AGH S Sbjct: 300 RYTFQYALDLQLQSLVRVIEAGHPS 324 >UniRef50_O28401 Putative uncharacterized protein n=1 Tax=Archaeoglobus fulgidus RepID=O28401_ARCFU Length = 345 Score = 158 bits (400), Expect = 2e-37, Method: Composition-based stats. 
Identities = 36/275 (13%), Positives = 82/275 (29%), Gaps = 48/275 (17%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGV 83 + + +G V+ +K + + +++ +S A++L + +V++ G Sbjct: 11 YLGIENGLIVVKEKGKALRKVRPEDLKQVLIIGKAAISSDAIKLLLKNRVDVVFLDFNGE 70 Query: 84 RVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSV--------- 134 + P + Q D+ + + ++ + + ++ Sbjct: 71 ILGRLSHPLIGTAKTRREQYLAYGDKRG-VHLAKEFIKAKMANQMAILTNLAKARKDSNP 129 Query: 135 --------------------------------EQLRGIEGSRVRATYALLAKQYGVT--W 160 E+L GIEG + + ++ + Sbjct: 130 EVAESLLKAKKEIDACLNELDGVEAEMIDKVRERLLGIEGKASKHYWDAISLVIPEEYRF 189 Query: 161 NGRR--YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFV 216 NGRR D +N ++ S L A+ AG P GF+H S Sbjct: 190 NGRRGIEIGSPRYAKDIVNAMLNYGYSILLAECVKAVELAGLDPYAGFLHVDVSGRSSLA 249 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLAC 251 D+ + + V + +P+ + Sbjct: 250 IDLMENFRQQVVDRVVLRLISYRQIKPEDCEKRNM 284 >UniRef50_C3MWK6 CRISPR-associated protein Cas1 n=6 Tax=Sulfolobus RepID=C3MWK6_SULIM Length = 341 Score = 157 bits (397), Expect = 4e-37, Method: Composition-based stats. 
Identities = 38/309 (12%), Positives = 88/309 (28%), Gaps = 55/309 (17%) Query: 4 LPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHA 63 + +KD + + ++ G I K ++ I + I++ + +S Sbjct: 42 MDKKIAFVKDYGAYLKIEKGLITCK-------IKDQVKWSIAPTELHSIIVLTNSSISSE 94 Query: 64 AVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVR------ 117 V++A + G +V+ L Q + ++ + Sbjct: 95 VVKVANEYGIEIVFFNNNEPYAKLIPAKYAGSFKVWLKQLTAW--KRRKVDFAKAFIYGK 152 Query: 118 --------KMFEL------------RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG 157 + +E R + E++ E + + + Sbjct: 153 VHNQWVTLRYYERKYGYDLKSQELDRLAREVMFVNTAEEVMQKEAEAAKVYWRGVKSLLP 212 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SF 215 + + + + D N+ ++ L V A+++ G P IGF+H + S Sbjct: 213 KSLGFKGRRKRVSDNLDPFNRALNIGYGMLRKVVWGAVISVGLNPYIGFLHKFRSGRISL 272 Query: 216 VYDIADIIKFDTVVPKAFEIARRNPGEPDR------------------EVRLACRDIFRS 257 V+D+ + + V K R + + + R I Sbjct: 273 VFDLMEEFRSPFVDRKLIGFVRESADKITDLKTVYSLFSDVKEDEIYTQARRLVNAILND 332 Query: 258 SKTLAKLIP 266 + L Sbjct: 333 EEYRPYLAK 341 >UniRef50_B8GDW2 CRISPR-associated protein Cas1 n=1 Tax=Methanosphaerula palustris E1-9c RepID=B8GDW2_METPE Length = 327 Score = 157 bits (397), Expect = 5e-37, Method: Composition-based stats. 
Identities = 41/276 (14%), Positives = 87/276 (31%), Gaps = 31/276 (11%) Query: 31 AFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQ 90 +++ K G T P+G V +++ G + A ++ G + + G V Sbjct: 26 RLLIVQKNGTTTEYPIGDVHHLLVVGGHTIHSAVLQHMQNAGNWVSFFAADGTPVGLIRP 85 Query: 91 PGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG------------------------- 125 P +++ + A L + R R Sbjct: 86 PEDRVDEQVRAIQRHAPAHSYALGITRAALGRRLQVIGETTVVTGESPLYQGELEVLQDA 145 Query: 126 -EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 + +++++R + Y ++A+ R D +N +S + Sbjct: 146 RQELEYLVTLDEIRRLHRLATDMYYEIMARTIPKGTGFR--RRTARPYMDPVNTMLSFSY 203 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPD 244 L GV + A IG +H G + V D+ ++ K V F + R+ D Sbjct: 204 GILSGVCAVHLAGAHLDANIGLLHQG-ERALVRDLTELFKPQMVDQPIFALVRQGITASD 262 Query: 245 REVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 E + S + +++ ++ + I Sbjct: 263 YE--IGESRCTLSDALIRRMLLHLQTSIEVTAIGRQ 296 >UniRef50_C3MX12 CRISPR-associated protein Cas1 n=11 Tax=Sulfolobaceae RepID=C3MX12_SULIM Length = 304 Score = 157 bits (396), Expect = 6e-37, Method: Composition-based stats. 
Identities = 43/295 (14%), Positives = 99/295 (33%), Gaps = 38/295 (12%) Query: 6 LNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAV 65 + + + + + I+++ ++I K + I V I++ +S +A+ Sbjct: 1 MRTLVISEYGAYIYVKK--------NMLVIKKGDNKVEISPSEVDEILITASCSISTSAL 52 Query: 66 RLAAQVGTLLVWVGEAGVRVYASGQPG-GARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 LA G ++++ Q + + + ++ ++ + Sbjct: 53 SLALTHGISVMFLNSRDTPWGILLPSVITETVKTKKAQYETIVAKKD-IRYGEEIISSKI 111 Query: 125 GEPAPA----------RRSVEQLRGI-EGSRVRATYALLAKQYG--VTWNGRRYDPKDWE 171 + R ++L G E + R + +++ + ++GR D + Sbjct: 112 YNQSVHLKYWTRLTGTRNDYKELLGKDEPTAARIYWRNISQLLPKDIGFDGR-----DVD 166 Query: 172 KGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVV 229 D N ++ + + LY ++ AG P +GF+H +P S VYD +++ K Sbjct: 167 GVDQFNMALNYSYAILYNTIFKYLVIAGLDPYLGFIHKDRPGNESLVYDFSEMFKPYI-D 225 Query: 230 PKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPE 284 R RL +D + L LI + + Sbjct: 226 FLLVRALRSG-------FRLKVKDGLIEENSRGDLAKLIRKGMEEKVKEESDHNP 273 >UniRef50_B1YCK7 CRISPR-associated protein Cas1 n=2 Tax=Thermoproteaceae RepID=B1YCK7_THENV Length = 303 Score = 155 bits (392), Expect = 2e-36, Method: Composition-based stats. 
Identities = 43/303 (14%), Positives = 86/303 (28%), Gaps = 45/303 (14%) Query: 15 VSMIFLQYGQ-IDVIDGAFVLIDKTGIRTHIPVGSVACI-MLEPGTRVSHAAVRLAAQVG 72 + ++ ++G + A V+ + G IP+ V + +L G +S VR A+ Sbjct: 1 MEVVVKEHGVSLGYSRWALVVRRRGGAAERIPIHQVDRLWILTGGVSISSRLVRALARSF 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARS-DKLLYQAKLALDEDLRLKVVRK------------- 118 +V+ G P + Q + L+ L++ + Sbjct: 61 VDVVFFDGRGNPAARLFPPEANGTVAHRRAQYEAYLNGRG-LELAKLVVYGKIVNQAAAL 119 Query: 119 ----MFELRFGEPA--------------PARRSVEQLRGIEGSRVRATYALLAKQYGVTW 160 ++ + P + + G EG +A L+K +G Sbjct: 120 RRAGLWRRELYQELAGAASRVAEAAAAVPRCGDPQCVLGHEGRAAAEYWAALSKAFGTP- 178 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYD 218 +D D N ++ L + G P G++H K S V D Sbjct: 179 ------TRDPNASDPFNLALNYGYGILRYAVWRQAVIHGLDPYAGYLHVDKSGRPSLVLD 232 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 + + + + ++ R +L P I + Sbjct: 233 LMEEFRPHI-DLMVLKAKPSADWLEGGVLKREARAALVEKWLEMRLEPTIARQVGLAVAH 291 Query: 279 PPA 281 Sbjct: 292 LEG 294 >UniRef50_C9LGP6 CRISPR-associated protein Cas1 n=3 Tax=Prevotella RepID=C9LGP6_9BACT Length = 295 Score = 155 bits (392), Expect = 2e-36, Method: Composition-based stats. 
Identities = 41/298 (13%), Positives = 106/298 (35%), Gaps = 25/298 (8%) Query: 14 RVSMIFLQYGQIDVIDGAFVLIDKT--GIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQ 70 + S++F+ + + +G V+I K +P+ + +M+ ++ + + Sbjct: 3 KRSLVFMHPATLSLRNGQMVIIRKEIPDDNLIVPIEDIGLVMINHAMVSLTIPLLNALTE 62 Query: 71 VGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPA 130 +++ E G+ + + + + L + + + +++ Sbjct: 63 QNVAVIFCNEKGMP---ASMLYNLQGNTTQGE-TLHNQLEAGEVLKKTLWKQIIEAKIKN 118 Query: 131 RRSVEQLRGIEGSRVRATY------------ALLAKQYGVTWNGRRYDPKDWEKGDTINQ 178 + ++ G EGS ++ Y + A+ Y GR + G IN Sbjct: 119 QAALLNKMGKEGSILKPLYTNVKSGDSDNREGIAARLYWTALFGRDFIRDRNIPG--INS 176 Query: 179 CISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIA 236 ++ S L A++++G PA+G H + +F D+ + + V FE+ Sbjct: 177 LLNYGYSVLRAAVTRALVSSGLFPALGIFHHHRSNAFPLSDDLMEPFRP-FVDEIVFELT 235 Query: 237 RRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ-PPAPPEDAQPVAIPL 293 + E + + + +K+ + L+ ++ + + +PL Sbjct: 236 AQGEAELNTATKSRLIRVLYVDTYFSKITRPLSVGLSMTMASLAKCYAKEQKKLVVPL 293 >UniRef50_Q1CW50 CRISPR-associated fusion protein Cas4/Cas1 n=5 Tax=Bacteria RepID=Q1CW50_MYXXD Length = 568 Score = 154 bits (390), Expect = 3e-36, Method: Composition-based stats. 
Identities = 47/308 (15%), Positives = 88/308 (28%), Gaps = 48/308 (15%) Query: 7 NPIPLKDRVSMIFL--QYGQIDVIDGAFVLI--DKTGIRTHIPVGSVACIMLEPGTRVSH 62 P D ++ + ++ V+ + G + P V+ ++ +VS Sbjct: 222 RLFPEDDVRQVLHVATPGTRVGRAAEELVVTPPEGEGAPSRQPGRMVSALIAHGAVQVSA 281 Query: 63 AAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFEL 122 A+ + + W R G + L Q + + L + R++ Sbjct: 282 QALAYCVENDIGVHWFTSG-GRYLGGLGGGAGNVHRRLRQFEALRQASVCLGLARRLVAA 340 Query: 123 RFGEPA-------------------------------PARRSVEQLRGIEGSRVRATYAL 151 + S+E L G+EG+ + Sbjct: 341 KLEGQLRFLLRASRGDSESRQVLASAVRDLRALLPKCEEAPSLEVLLGLEGAGAARYFGA 400 Query: 152 LAKQ------YGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIG 205 L + + GR P D N + ++ EAAI A G A G Sbjct: 401 LPYLQGEDVDTRLRFEGRNRRPPR----DRFNAVLGFLFGLVHREVEAAIRAVGLDVAFG 456 Query: 206 FVHTGK--PLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAK 263 F H + D+ ++ + R + D + + ++ S AK Sbjct: 457 FYHQPRGTAGPLGLDVMELFRVPLADMPLVASVNRRAWDADADFEVTSEHVWLSKAGRAK 516 Query: 264 LIPLIEDV 271 I L E Sbjct: 517 AIELYERR 524 >UniRef50_C5SD37 CRISPR-associated protein Cas1 n=1 Tax=Allochromatium vinosum DSM 180 RepID=C5SD37_CHRVI Length = 346 Score = 153 bits (386), Expect = 9e-36, Method: Composition-based stats. 
Identities = 55/291 (18%), Positives = 90/291 (30%), Gaps = 43/291 (14%) Query: 17 MIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTL 74 ++ + + Q+ + GA L+ HIP+G + +++ V R A+ G Sbjct: 6 LLVIDHRDTQLTLDGGALRLVTPGTKPRHIPLGVLGLVVVHGRALVGCDVWRALAERGIP 65 Query: 75 LVWVGEAGVRVYAS-GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP---- 129 V G V A G GA Q + A + RL + R + + A Sbjct: 66 AVMQPGRGRGVCAWMGPALGATGVLRAAQHRAAEQVERRLMLARDLIAAKLLAQARVIER 125 Query: 130 --------------------------ARRSVEQLRGIEGSRVRATYALLAKQYGVTW--N 161 S ++ G+EGS A + L TW Sbjct: 126 LPVASDPPETRTAVRRQQDLALARLGQASSTTEVMGLEGSAAAAWFRWLTLWLSPTWGFQ 185 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDI 219 GR P D +N +S + L G + I G PA G +H S V D+ Sbjct: 186 GRNRRPPR----DPVNALLSLGYTLLGGEMLSVIQQQGLDPARGLLHELVPGRESLVLDL 241 Query: 220 ADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 + ++ V + R P+ S + + Sbjct: 242 IEPLRPS-VDLVLLGMLDR-LLTPEDFTTSPEDGCRLSKEARGRFYQAWAQ 290 >UniRef50_A3XI90 Putative uncharacterized protein n=1 Tax=Leeuwenhoekiella blandensis MED217 RepID=A3XI90_9FLAO Length = 332 Score = 152 bits (385), Expect = 1e-35, Method: Composition-based stats. 
Identities = 52/304 (17%), Positives = 94/304 (30%), Gaps = 52/304 (17%) Query: 24 QIDVIDGAFVLIDKTGIRT-----HIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWV 78 + V D F + + + HI V ++ G ++ A+ LA + +V V Sbjct: 11 YLHVKDDMFEIKVREDKKQPFNVNHIAAHKVTSFIVSKGAALTTDAIALALKHNIDIVLV 70 Query: 79 GEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP--------- 129 G + + K+ Q +A + +++ + A Sbjct: 71 ENNGHPMGRFWHSKLGSTTKIRKQQLVASLNQTGVYWIKEWLSQKLENQADYLNDLKKHR 130 Query: 130 ------------------------ARRSVEQL----RGIEGSRVRATYALLAKQYG--VT 159 + QL RG EGS R + LA + Sbjct: 131 KNLHVYLDEKSAAILGFRKKIKEADGADINQLAESFRGWEGSAGRHYFEALATCIPDAYS 190 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVY 217 + GR + P D N ++ A LY E A++ AG P +GF+H S VY Sbjct: 191 FKGRSFRP----AQDEFNALLNYAYGILYSRVERALMLAGLDPFVGFMHRDDYNSKSLVY 246 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 D + + F+ + + + A +P + L + +I Sbjct: 247 DFIEPYR-IYAERFVFKSFSSKKMN-KSYFEVFEGGVSLNPDGKAFFVPEYLEYLDSDKI 304 Query: 278 QPPA 281 + Sbjct: 305 KYKG 308 >UniRef50_A8ABK8 CRISPR-associated protein Cas1 n=1 Tax=Ignicoccus hospitalis KIN4/I RepID=A8ABK8_IGNH4 Length = 305 Score = 152 bits (383), Expect = 2e-35, Method: Composition-based stats. 
Identities = 56/313 (17%), Positives = 100/313 (31%), Gaps = 51/313 (16%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 + + G++ V DG + G ++ I+ V+ AA+R ++G LV Sbjct: 4 FVISEPGKLFVKDGGLAFANSKGEVAYLANLYDVIILATSKVSVTGAALRAMGRLGVDLV 63 Query: 77 WVGEAGVRVYASGQP-GGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP-------- 127 + G P + L Q ++ L + LK + M + E Sbjct: 64 VLEWNGRPSGRFSSPVPNKSALARLKQYEVVL-KGEGLKYAKPMIVRKIIEQGRTLRYFA 122 Query: 128 ---------------------APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYD 166 A + S E LR +E R + LL++ + + GR Sbjct: 123 KTKRMKWLREASYELEKLSADASSAGSPEALRAVEAQAARLYWGLLSEAFP-EFPGRE-- 179 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIK 224 E D N I+ + LY A+ G P G H K + VYD ++ K Sbjct: 180 ---HEGCDPYNSAINYSYGILYSYAFKALSVTGLDPYAGLFHAIKSGREALVYDFSEQFK 236 Query: 225 FDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAPPE 284 V +A +A + + + L I+ +L + + A Sbjct: 237 P-LVDRRALPLAHELEVD----------GCSLTYSSRKALGEEIKKLLESCDKTLLAEAW 285 Query: 285 DAQPVAIPLPVSL 297 + ++ Sbjct: 286 NLA-SSMREGREY 297 >UniRef50_C0W0W5 CRISPR-associated Cas1 family protein n=1 Tax=Actinomyces coleocanis DSM 15436 RepID=C0W0W5_9ACTO Length = 287 Score = 151 bits (381), Expect = 3e-35, Method: Composition-based stats. 
Identities = 50/290 (17%), Positives = 95/290 (32%), Gaps = 42/290 (14%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 GQI GA + +P+ VA +++ S A+ G ++ G Sbjct: 3 GQISSARGAISIEPDGKEPVLVPISDVAVLLIGHRVVFSGGALHRCLSAGVAVMLCDWRG 62 Query: 83 VRVYAS--GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRG- 139 V + + + + QA+L E R +++ + + A A + LRG Sbjct: 63 VPEGGAFGWSDHTRVAARRIAQAQL--SEPRRKNAWKQIIKEKLRGQASALDDLG-LRGG 119 Query: 140 -----------------IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISA 182 +E + + L G P +N + Sbjct: 120 DFLRELRKQVRSGDPANVEAQAAKFYWKALGG------EGFNRVPGARFG---VNGMLDY 170 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNP 240 A + + G A+L+AG P++G H G+ +F D+ ++ + V + F + Sbjct: 171 AYAIVRGHGIRAVLSAGLEPSLGVFHHGRSNAFCLVDDLLEVFRPA-VDAQVFGLLGDGE 229 Query: 241 GEPD----REVRLACRDIFRSSKTLAKLIPLIEDVLA---AGEIQPPAPP 283 E D V +AC T+ + +++ PP Sbjct: 230 VEFDEVKHDLVDIACGKFSVDGLTIPAVFEDFAQQFGLYIEDDVEKLVPP 279 >UniRef50_Q2FPW6 CRISPR-associated protein Cas1 n=1 Tax=Methanospirillum hungatei JF-1 RepID=Q2FPW6_METHJ Length = 303 Score = 149 bits (377), Expect = 1e-34, Method: Composition-based stats. 
Identities = 42/282 (14%), Positives = 84/282 (29%), Gaps = 31/282 (10%) Query: 26 DVIDGAFVLI-DKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + + G T IP+ ++ +L G + + + + G + + G Sbjct: 7 HIKSNRTEITIQHKGKITDIPIKDLSHFLLIGGHTIQTSTITSLVKEGVFISFCESDGEP 66 Query: 85 V-YASGQPGGARSDKLLYQ--------AKLALDEDLRLKV-VRKMFELRFG--------- 125 V Y S + Q A +E ++ ++ + + G Sbjct: 67 VGYISPYDYSLFKEIQNLQKTAAPYSYALACANESIKSRILAIEKYAEEIGPEILFSGEL 126 Query: 126 -------EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQ 178 + +E+LR IE Y +L + T+ + D +N Sbjct: 127 DILTGYAKELENMVLIEELRRIEQLVRDMYYEILGRLISPTYLFK--RRTSRPYLDPVNA 184 Query: 179 CISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARR 238 S L A++ P G+++ G + V D+ + K + A R Sbjct: 185 IFSFGYGMLSSACTRAVIGGHLDPGHGYLNRGNQ-ALVQDLMNCWKPKMIDNHAIGFLRS 243 Query: 239 NPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 + R R + + +LI L + I Sbjct: 244 GRLHQNGYERTKDR-CILHDEVIEELIHLFSKSIQEELINTQ 284 >UniRef50_A6UVX9 CRISPR-associated protein Cas1 n=2 Tax=Methanococcales RepID=A6UVX9_META3 Length = 334 Score = 149 bits (377), Expect = 1e-34, Method: Composition-based stats. 
Identities = 44/295 (14%), Positives = 90/295 (30%), Gaps = 53/295 (17%) Query: 25 IDVIDGAFVLI-DKTGI--RTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWV--- 78 I + F + +K G +T IP + I+L ++ A+ LA + ++++ Sbjct: 12 ISKKEDRFRVELEKDGKIIKTDIPATKIERIILSNNGTITTGAISLAIENNIPIIFLKWE 71 Query: 79 ----------------GEAGVRVYASGQPGGARSDKLLYQ---------------AKLAL 107 ++ G + + Q Sbjct: 72 EPTGMIWHCRIGKTAKTRRSQLKFSESINGIRFASNWIKQKMGNQLNYLKDLKKNHNHKG 131 Query: 108 DEDLRLKVVRKM------FELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG--VT 159 D D +K + + + P R + + GIEG + + + Sbjct: 132 DFDTVIKNINNYIMELTQYTNNIDDKLPHREIKDTIIGIEGIASKYYFEGINHALPKNYK 191 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVY 217 + R P D N ++ LY + E ++ AG P +GF+H + VY Sbjct: 192 FKERSRRPAR----DKFNALLNYGYGMLYPMVEKCLIVAGLDPYVGFIHADNYNKTTLVY 247 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 D+ ++ + + + I S + P I+D++ Sbjct: 248 DVIEMYRAHI-DRGVVNFINSKKVK-GSHFNILENGISLSDDGKGEFAPYIKDIV 300 >UniRef50_C8PKY6 Putative CRISPR-associated protein Cas1 n=1 Tax=Campylobacter gracilis RM3268 RepID=C8PKY6_9PROT Length = 731 Score = 148 bits (375), Expect = 2e-34, Method: Composition-based stats. 
Identities = 39/246 (15%), Positives = 83/246 (33%), Gaps = 32/246 (13%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + + G VL K I+ P+ ++ I++ +S A ++ A+ + ++ E Sbjct: 421 LALSQGKLVLKSKGAIKHKFPINQISQIIINAQISLSSAVIKECAKKKISINFIDEKTNL 480 Query: 85 VYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA---------------- 128 YA+ + K L L++ ++ + Sbjct: 481 SYATLISANSAIPKTAASQISLLTTKKSLRIAQQFIIGKLKNQINYLKYLGKYHKNLGAE 540 Query: 129 ------------PARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 P SV +L G EGS + + +AK + D + Sbjct: 541 IKAMQEILKLRVPGAASVSELMGFEGSAANSYWQAIAKAVDYEFGFSA--RVTQGATDIV 598 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPKAFE 234 N ++ + LY +I AAG +P + ++H + + +D+ + + V Sbjct: 599 NSALNYGYAILYSKILKSIAAAGLSPHVSYLHALDEQKPTLAFDLIEEFRAFIVDRAVIS 658 Query: 235 IARRNP 240 + +N Sbjct: 659 MVNKNE 664 >UniRef50_D2QT50 CRISPR-associated protein Cas1 n=1 Tax=Spirosoma linguale DSM 74 RepID=D2QT50_9SPHI Length = 351 Score = 148 bits (374), Expect = 2e-34, Method: Composition-based stats. 
Identities = 60/313 (19%), Positives = 102/313 (32%), Gaps = 60/313 (19%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPV----GSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVG 79 + V D F + K I V I+L GT +S AVRLA + +V++ Sbjct: 11 YLHVKDAMFDVRRKGEDGKVISATYSAEKVTHILLATGTSLSTDAVRLAMRHNVDIVFIE 70 Query: 80 EAGVRVYASGQPGGARSDKLL-------------YQAKLA--LDEDLRLKVVRKMFELR- 123 + G + + K+ + D ++ +R + + R Sbjct: 71 QQGDPIGRVWHAKLGSTTKIRKRQLEASLGPDGLRWVRAWLLAKLDNQMGFIRSLKKHRP 130 Query: 124 -------------------------FGEPAPARRSV----EQLRGIEGSRVRATYALLAK 154 GE PA V + LRG+EG+ R + L+ Sbjct: 131 QHAGYLDDKLVRIEAMALSISTLASVGEQTPATTCVADVADTLRGLEGTAGRLYFETLSY 190 Query: 155 QYG--VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG-- 210 ++GR P D N ++ LYG E ++ AG P +GF+H Sbjct: 191 VLPKEYQFSGRSSRP----AQDAFNAFLNYGYGMLYGKVEKTLMMAGLDPYVGFLHRDDY 246 Query: 211 KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDR--EVRLACRDIFRSSKTLAKLIPLI 268 LS VYD + + F + EV + + ++ A L+ Sbjct: 247 NQLSMVYDFIEPYRGW-TDEVVFRLFTAKKVNKAHIGEVSGSRTGVSLNADGKALLVNAF 305 Query: 269 EDVLAAGEIQPPA 281 + + I+ Sbjct: 306 NECMDNDPIRYRG 318 >UniRef50_A1WUP2 CRISPR-associated protein Cas1 n=1 Tax=Halorhodospira halophila SL1 RepID=A1WUP2_HALHL Length = 320 Score = 148 bits (373), Expect = 3e-34, Method: Composition-based stats. 
Identities = 50/295 (16%), Positives = 98/295 (33%), Gaps = 45/295 (15%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 + +++ ++++ A + + +P+ + +++ +S + A+ G Sbjct: 1 MGTLYIDRRRTRLELAHKALTIREPEAQPRSVPLSLIDRLIVIGQVELSSGVLTTLAESG 60 Query: 73 TLLVWVGEAGVRVYASGQP-GGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG------ 125 LV++ G R A + G + + L Q +L E R R++ LR Sbjct: 61 VSLVFMPSRGQRRSAFLRSEGHGDAVRRLGQYRLIHLEAERQAWARRLVRLRLAGQQRLL 120 Query: 126 ------------------EPAPARRSV--------EQLRGIEGSRVRATYALLAKQYG-- 157 A ++ EQLRG EG+ A + + Sbjct: 121 ASALYRRPDQRQPLTAAHREIEAAQATVRREAPAGEQLRGQEGTAAAAFFRGYGALFAEA 180 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSF 215 + ++GR P D +N +S + +G A+ AAG PAIG +H S Sbjct: 181 LGFSGRNRRPPR----DPVNAVLSLGYTLAHGDALRAVTAAGLDPAIGVLHEPAWGRDSL 236 Query: 216 VYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 D+ +I + V +E+ + + E Sbjct: 237 ACDLTEIARAR-VERLTWELFASETLQRTDFTNSTE-GVRLGKAARQTFFGCWER 289 >UniRef50_D0YU98 Crispr-associated protein Cas1 n=1 Tax=Mobiluncus mulieris 28-1 RepID=D0YU98_9ACTO Length = 309 Score = 147 bits (370), Expect = 6e-34, Method: Composition-based stats. 
Identities = 45/296 (15%), Positives = 97/296 (32%), Gaps = 41/296 (13%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 G + +GA V+ + G +P+ VA ++ TR S +V + T +++ Sbjct: 18 RGFVTSTEGALVVRPEEGEERRVPISDVAVVLFGVDTRFSAGSVHRILKNDTAVIFCDWK 77 Query: 82 GVRVYAS--GQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA-------PARR 132 GV + + + QA ++ + R + G+ P + Sbjct: 78 GVPYGHAYPWGDHTRVGARQIAQANASIPARKSVW-ARLIKSKVLGQAEVLEFFGRPNGK 136 Query: 133 SVEQLR---------GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDT--INQCIS 181 +++L IEG R + L W+ + + GD IN + Sbjct: 137 RLKELVKDIRSGDPSNIEGQAARIYWESL-------WDDQDFRRTPGAGGDIFTINAMLD 189 Query: 182 AATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEIARRN 239 + L G A+ A+G ++G H G+ D + + + + Sbjct: 190 YGYTILRGHAMRAVAASGLISSLGVAHRGRSNPWNLADDFIEPFRPAI-DYSVYLLVSEC 248 Query: 240 PGEPDREVRLAC----RDIFRSSKT-----LAKLIPLIEDVLAAGEIQPPAPPEDA 286 + + E++ D+F +++ + L ++ + P Sbjct: 249 -IDDEAEIKKRLVNSAGDVFTATRKSIPAEMTLLAQHYGQLIEGDIGKFGVPSWKP 303 >UniRef50_A3CTI4 CRISPR-associated protein Cas1 n=1 Tax=Methanoculleus marisnigri JR1 RepID=A3CTI4_METMJ Length = 392 Score = 146 bits (369), Expect = 8e-34, Method: Composition-based stats. 
Identities = 43/274 (15%), Positives = 87/274 (31%), Gaps = 31/274 (11%) Query: 33 VLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR---VYASG 89 ++I + P+ +V +++ G + +AV + G + G +Y G Sbjct: 105 LIIARGSDTRRYPIQAVKHLLIVGGHTLHTSAVTNLLKAGAAITIFDIDGTPVGYLYPYG 164 Query: 90 QPGGARSDKLL--------YQAKLALDEDLRLKVVRKMFEL---------------RFGE 126 Q RL ++ ++++ + E Sbjct: 165 YRPDESVRLAQERAGPHRFAQPLARASLQSRLLLLEELYDHAGHDIFYAGELDFLHQARE 224 Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 A ++E LR + Y +L++ R D +N + + Sbjct: 225 ELSASVTMENLRRLSRLTTDMYYEILSRTLPPELGFR--RRTSRPYLDPVNAMFALGYAM 282 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDRE 246 +YG +++ A P +G +H G SFV+D+ + K V AR D E Sbjct: 283 IYGNCCVSVVGAHLDPDLGMLHEG-AGSFVHDLIEPQKASMVDRAVIRFAREEISSGDYE 341 Query: 247 VRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPP 280 + + ++L + D + I Sbjct: 342 C--GEKRCYLGGDLSSRLAAALRDSIDQARIDAQ 373 >UniRef50_C8N6V7 Putative uncharacterized protein n=1 Tax=Cardiobacterium hominis ATCC 15826 RepID=C8N6V7_9GAMM Length = 334 Score = 146 bits (368), Expect = 1e-33, Method: Composition-based stats. 
Identities = 40/297 (13%), Positives = 89/297 (29%), Gaps = 44/297 (14%) Query: 15 VSMIFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 +S +++ ++ V + + + +P+ + I ++ +S A + + Sbjct: 1 MSTLYIDRQNTRMTVSGNTLIFYENGERASTLPLHIIDRICIKGDLALSAADLGKLGEHN 60 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA---- 128 ++ + + + + L QA + D + + + Sbjct: 61 IGVLILSGREQQPTIYLPCARKDALRRLAQAHFSQDNTFCTRQAQSWITEKIQREQDLLR 120 Query: 129 ----------------------------PARRSVEQLRGIEGSRVRATYALLAKQYG--V 158 + LRGIEGS +T+A +A + Sbjct: 121 ELQTRPHRGGHQLHENLEQLEKSRLRLQNPINDLATLRGIEGSAASSTFAAIACVLPESL 180 Query: 159 TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFV 216 + R +P D N +S + L+ I G P IG+ H+ + S Sbjct: 181 HFTKRNRNPPR----DPYNVGLSLGYTLLHYAMVRQIHLTGLDPCIGYYHSIEHGRESLA 236 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 D+ + ++ E P+ + + A+ P E+ L Sbjct: 237 CDLIEAMRPLVTAWTI-EAFHDRILRPE-DFTMQNEACQMGKAARARYYPAFENALK 291 >UniRef50_B3PMY9 Putative uncharacterized protein n=1 Tax=Mycoplasma arthritidis 158L3-1 RepID=B3PMY9_MYCA5 Length = 297 Score = 145 bits (367), Expect = 2e-33, Method: Composition-based stats. 
Identities = 49/280 (17%), Positives = 94/280 (33%), Gaps = 28/280 (10%) Query: 33 VLIDKTGIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVGTLLVWVGEAGVRVY-ASGQ 90 ++++K G + IP + ++ E ++ + G ++ G + Sbjct: 21 LVVNKDGTKITIPTSQIDTVLFENDKLTITLPLINDLVDHGINIIVCGRNHMPKAQIIPF 80 Query: 91 PGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRG-----IEGSR- 144 G + L Q K D R V R++ +L+ + + LR +E Sbjct: 81 QGYYNAKILQKQIKWDNDFKER--VWRRIIKLKIRQTMVMLEHLAILRDDDKVKLEDYAN 138 Query: 145 ------VRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAA 198 + AK G ++ K DT N ++ S L AI + Sbjct: 139 SVKAFDITNCEGHCAKLNFKILFGEKFVRKTNTMEDTYNAYLNYGYSVLLSYVARAICSK 198 Query: 199 GYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAFEIARRNPGEPD-REVRLACRDIF 255 GY +G H D+ + + V ++ R + GEP+ +E + IF Sbjct: 199 GYDNRLGIFHRSYSNFYPLACDLMEPFRC-IVESLVYKHVRISRGEPNFQEFKEELFTIF 257 Query: 256 RSS--------KTLAKLIPLIEDVLAAGEIQPPAPPEDAQ 287 + + L+ VL+ +I+ P A+ Sbjct: 258 YKHFNCQGENLMLIDCIDKLVVLVLSDHDIKGPYFNWSAE 297 >UniRef50_B4ATI8 Crispr-associated protein Cas1 n=5 Tax=Proteobacteria RepID=B4ATI8_FRANO Length = 318 Score = 145 bits (366), Expect = 2e-33, Method: Composition-based stats. 
Identities = 46/274 (16%), Positives = 92/274 (33%), Gaps = 37/274 (13%) Query: 17 MIFLQYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTL 74 +I +YG + + G V+ +K + P+ +A + + G S V + G Sbjct: 4 LIISEYGIYLGLESGRLVVKNKDDKKY-FPLNRLATLSIAKKGVSFSSDLVEQFSLRGIK 62 Query: 75 LVWVGEAGVRV-YASGQPGGARSDKLLYQAKLALDEDLRLKV---VRKMFE--------- 121 L ++ GV G A + Q + L L + K+ Sbjct: 63 LFFLDFRGVAHSMLVGANQHAVVQARINQYRYIDRNALTLSIKLIAAKIKNQRATLNYFN 122 Query: 122 ---------------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYD 166 R + +++ + G EG ++ LAK ++ + + Sbjct: 123 KHHKSINLLNAIEELKRVAQLIKNAKTLNDVLGYEGYAANIYFSSLAKDKFLSASF--AN 180 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIK 224 + + N ++ + L AI AG P +GF+H +P S V D+ + + Sbjct: 181 REGRGSQEIANSMLNFGYAILSSYILNAITNAGLEPYLGFLHQKRPGKMSLVLDLMEEYR 240 Query: 225 FDTVVPKAFEIARRNPGEPDREVRLACRDIFRSS 258 V ++ R + + + + I S Sbjct: 241 AWVVDRVVIKL--REQYKNKQYIDTKLKSILISE 272 >UniRef50_UPI0001BCCAFD hypothetical protein SnoxA4_00467 n=1 Tax=Selenomonas noxia ATCC 43541 RepID=UPI0001BCCAFD Length = 281 Score = 145 bits (365), Expect = 2e-33, Method: Composition-based stats. 
Identities = 33/251 (13%), Positives = 65/251 (25%), Gaps = 37/251 (14%) Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELR- 123 + ++G + W+ G + Q L D L + RK+ + Sbjct: 1 MVELLRLGIPVTWLSRTGYFFGRLESTRHVNVFRQERQI-LLRDSFFYLAMARKVIAAKA 59 Query: 124 ----------------------------FGEPAPARRSVEQLRGIEGSRVRATYALLAKQ 155 + P + QL G EG+ + + L Sbjct: 60 HNQFILLRRYNRSASLPEVRTAMAEITALSKHIPRCETNTQLMGYEGAIAKVYFRALGLL 119 Query: 156 YGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPL 213 + D N +S + L + G P GF+H + Sbjct: 120 VPEAFAF--VKRSRRPPMDPFNTMLSFGYTLLMYDLYTVVNNEGLHPYFGFLHALKNRHP 177 Query: 214 SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLAC---RDIFRSSKTLAKLIPLIED 270 + D+ + + V + + P+ IF + + A + E Sbjct: 178 ALASDLMEEWRPVLVDAMVLSLVHHHEMRPEHFAPSEEEGRPGIFLTREGRAIFLRAYEK 237 Query: 271 VLAAGEIQPPA 281 + A + Sbjct: 238 KMRATSLYGGG 248 >UniRef50_B8D4S7 CRISPR-associated protein Cas1 n=1 Tax=Desulfurococcus kamchatkensis 1221n RepID=B8D4S7_DESK1 Length = 328 Score = 142 bits (358), Expect = 2e-32, Method: Composition-based stats. 
Identities = 37/291 (12%), Positives = 89/291 (30%), Gaps = 49/291 (16%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 ++ G ++ K +P+ + +++ G S +R + G + + G Sbjct: 12 RLRYRKGLMLIESKNEKN-EVPLTDLEQVVIETGGVWFSSKLIRKMVEYGIDFIVLDHRG 70 Query: 83 VRVYASGQPG-GARSDKLLYQAKLALDED--------------LRLKVVRKMFELRFGEP 127 P + Q + ++++ + E Sbjct: 71 YPAGRLYPPFISRTVETRRAQYAAYESWRGVHVMRELVYSKLANQAGLLKRYYMYTIMEE 130 Query: 128 APAR-------------------RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPK 168 + E+LR +E R + L+ ++ + Sbjct: 131 LKDAYKKIIELAYRARMLEAEFEEAREKLRQLEAEAARIYWPSLSILIPKELG---FNSR 187 Query: 169 DWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFD 226 D + D +N ++ A LYG + ++ AG P GF+HT + +D+ ++ + Sbjct: 188 DQDSEDLVNTSLNYAYGILYGESWKVLVLAGLDPYAGFMHTDRSGKPVLAFDLIEMFR-F 246 Query: 227 TVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 T + R + + ++ A+++ I L + Sbjct: 247 TADSTLLAMYRHG-------WKPRVLNGLLDYESRARIVESIMKTLENTKT 290 >UniRef50_A6DE79 CRISPR-associated protein Cas1/Cas4 n=1 Tax=Caminibacter mediatlanticus TB-2 RepID=A6DE79_9PROT Length = 281 Score = 142 bits (358), Expect = 2e-32, Method: Composition-based stats. 
Identities = 34/288 (11%), Positives = 92/288 (31%), Gaps = 35/288 (12%) Query: 15 VSMIFLQY-GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGT 73 ++ + + +++V + ++ DK +P+ + + L ++ + + Sbjct: 1 MNRVIIDRECKLEVRNSQLIVEDKK-----VPLRYIDFLYLIGEIEINTKTIMKLLKEDI 55 Query: 74 LLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRS 133 ++ +Y +D Q + + R++ + + Sbjct: 56 SILIQNRGFGLIY---PQKSKNNDLKKKQ---YFALKKEVDIAREIIRKKIQKSIDNLVK 109 Query: 134 V-----------------EQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 + + L GIEG+ + + + T + D + Sbjct: 110 LNKKIVVDFKILDEISSKDSLLGIEGNFAKEYFKEYFSLFDKTLT--KGYRSKRPPEDVV 167 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFE 234 N +S + LY ++ G+ I ++H +S D+ ++ + D V +E Sbjct: 168 NALMSYLYTLLYYEIANRLIFYGFEVGISYLHESFRDHMSLASDLLEVFRSD-VDVFVYE 226 Query: 235 IARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAP 282 + ++ + IF S+ ++ I+D +I Sbjct: 227 MFDNKKV-IKKDFTKEKKGIFLRSEKRKEIWSDIKDFFENLKIDEEIA 273 >UniRef50_A3DLB7 CRISPR-associated protein, Cas1 family n=1 Tax=Staphylothermus marinus F1 RepID=A3DLB7_STAMF Length = 341 Score = 142 bits (357), Expect = 2e-32, Method: Composition-based stats. 
Identities = 39/286 (13%), Positives = 75/286 (26%), Gaps = 48/286 (16%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + G V++ + G PV V +++ +S ++L A G ++ G Sbjct: 19 LRKKHGRIVVVSRGGRE-EFPVRRVREVIISGKAGISTELLKLLADSGIDVLVTSYTGRP 77 Query: 85 VYASGQPGG-ARSDKLLYQAKLALDEDLRLKVV--------------------------- 116 V + Q + D Sbjct: 78 VALFVHARSGGSVRNRIEQYRSLEDGRACKAAAMIISGKLSNQVSNLKYYSKPRKNISQE 137 Query: 117 -RKMFE-----LRFGEPAPARRS------VEQLRGIEGSRVRATYALLAKQYGVTWNGRR 164 + ++E + E + E + IE + + G R Sbjct: 138 SKILYEKAEEIKKLREKLRDIETSDLGKCRENIMSIEAQAANTYWDGMRIVLGKYGFKER 197 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADI 222 ++ D +N C++ + L G +L P GF+H +P S VYD+ + Sbjct: 198 VKRGRGKEVDPVNLCLNIIYNKLAGTVWKYVLRFSLDPFQGFLHARRPGKLSLVYDLMEP 257 Query: 223 IKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 + R D+ + K Sbjct: 258 FRP-IADR----FIARFLYRLDQGFLRRASGAELAKLLTKKYYEEF 298 >UniRef50_Q8F874 Putative uncharacterized protein n=1 Tax=Leptospira interrogans RepID=Q8F874_LEPIN Length = 254 Score = 141 bits (355), Expect = 3e-32, Method: Composition-based stats. Identities = 30/187 (16%), Positives = 63/187 (33%), Gaps = 5/187 (2%) Query: 97 DKLLYQAKLALDEDLRLKVVRKMFEL-RFGEPAPARRSVEQLRGIEGSRVRATYALLAKQ 155 L A+ + +E + + + ++ + S+E +RG EG+ + +++ Sbjct: 33 SVLSRTARKSKNESEKQDIKEAIGKIEKNISLLEKAESIESIRGYEGASAKTYFSVFDYC 92 Query: 156 YGVTWNGRRY-DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL- 213 ++ N +S S L A A G P IGF+H +P Sbjct: 93 IIQQKEDFQFHKRTRRPPRSRTNALLSFLYSLLTNDCIAVCQAVGLDPYIGFLHDERPGR 152 Query: 214 -SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVL 272 S D+ + + + F + R + + F + + +LI ++ Sbjct: 153 PSLALDMMEEFRP-FIDRLVFTLINRKQIQVSDFLEKPGSVFFINDDSRKELIKSYQERK 211 Query: 273 AAGEIQP 279 P Sbjct: 212 KEEIFHP 218 >UniRef50_A7I668 CRISPR-associated protein Cas1 n=1 Tax=Candidatus Methanoregula boonei 6A8 RepID=A7I668_METB6 Length = 310 Score = 140 bits (354), Expect = 5e-32, Method: Composition-based stats. 
Identities = 32/276 (11%), Positives = 81/276 (29%), Gaps = 32/276 (11%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 I ++ G P+ + +++ G ++ + G+ + + G Sbjct: 15 AHIKSTQKKLSIL-AKGKVEEYPLEEIQNLLIVGGHHLNSTTFSHLLRNGSYISFFEPDG 73 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF------------------ 124 + G + + A + + + + R Sbjct: 74 SPLGILKPWGSTCDTTIRALQEGAPRDRYATALAQAALKSRLIAIEHIQDQQGSSLFYEG 133 Query: 125 --------GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 ++++R + Y +L++ ++ R + D I Sbjct: 134 ELQILHNALHNLEYMVKLDEIRRLSDLTANMYYEILSRDIPREFDFR--RRTVRPQCDPI 191 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYDIADIIKFDTVVPKAFEIA 236 N +S + L+G ++ A P +G ++ G + V D+ K + + IA Sbjct: 192 NAMLSFGYAMLFGNCCVPVIGARLDPDLGILYEG-SGALVQDLMASFKAQMIDGVIWSIA 250 Query: 237 RR--NPGEPDREVRLACRDIFRSSKTLAKLIPLIED 270 R N G+ + ++ I++ Sbjct: 251 RDSLNVGDFEITSNRCILSDNLIQNLMSSFRKSIDN 286 >UniRef50_A3MVN2 CRISPR-associated protein, Cas1 family n=4 Tax=Thermoproteaceae RepID=A3MVN2_PYRCJ Length = 294 Score = 140 bits (354), Expect = 5e-32, Method: Composition-based stats. 
Identities = 55/270 (20%), Positives = 92/270 (34%), Gaps = 39/270 (14%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACI-MLEPGTRVSHAAVRLAAQVGTLLVWVG 79 YG +++++ G R P+ V + +L G ++ A+R + G ++ Sbjct: 7 SYGTRIRTRKGLLVVERGGERREYPLHQVDEVFILTGGVSITSRALRALLRAGAVVAVFD 66 Query: 80 EAGVRVYASGQP-GGARSDKLLYQAKLALD----EDLRLKVVRKM---------FELRFG 125 + G + +P G A +K Q A + R V +KM + R Sbjct: 67 QRGEPLGIFMRPVGDATGEKRRCQYAAAAGGRGLQWAREWVWKKMRGQLQNVKAWRRRLA 126 Query: 126 E-------------PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEK 172 A S ++ E + A + + G + GR D E Sbjct: 127 HYGDYVEQIGRALEALRAAASPGEVMEAEAAAAEAYWRAYGEVTG--FPGR-----DQEG 179 Query: 173 GDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDT--V 228 GD +N ++ L + +IL AG P +GF+H K S V D + + V Sbjct: 180 GDPVNAALNYGYGVLKALCFKSILLAGLDPYVGFLHVDKSGRPSLVLDFMEQWRPRVDAV 239 Query: 229 VPKAFEIARRNPGEPDREVRLACRDIFRSS 258 V K G D + RL Sbjct: 240 VAKVAGELATENGLLDHKSRLRVAAAVLEE 269 >UniRef50_D2QAN8 CRISPR-associated protein Cas1 n=2 Tax=Bifidobacterium dentium RepID=D2QAN8_9BIFI Length = 305 Score = 140 bits (353), Expect = 6e-32, Method: Composition-based stats. 
Identities = 52/296 (17%), Positives = 91/296 (30%), Gaps = 37/296 (12%) Query: 23 GQIDVIDGAFVLIDKTG-IRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 G I G + +P+ A +++ S A + A+ G ++ Sbjct: 15 GTISYERGRMKVRKHGESESVCVPLAQAAVVLIGLRVCCSSAVLHEMAKAGVSVMLCDWR 74 Query: 82 GVRVYASGQPGGARSDKLLYQ-AKLALDEDLRLKVVRKMFELRFGEPAPARRSVE----- 135 G+ A S+ + Q A+ + + K+ + + A S+ Sbjct: 75 GIPDAALHSWTNVPSEVAVRQIAQSEMTLPRKKNAWAKIVKAKIRGQASCLDSLGIEGGG 134 Query: 136 QLRGI------------EGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAA 183 LRGI EG R + + +G+R+ + N + A Sbjct: 135 ALRGIASSVRSGDTSNYEGYAAREYWKRI-----FIGDGKRFKRIPGDGT-GRNAQLDYA 188 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPG 241 + L G AIL+AG P +G H G+ F D+ ++ + A G Sbjct: 189 YTILRGFAVKAILSAGLIPTLGVNHHGRSNYFCLADDLLEVYRPAIDYWVA--QLEPEDG 246 Query: 242 EPDREVRLACRDIFRSS-----KTLAKLIPLIEDVLA---AGEIQPPAPPEDAQPV 289 D+ V+ D T+ I +I PE P Sbjct: 247 PSDKNVKRYLADSVNQQFDSSGLTIPSSISDFAQQFGLYCEKKIDALEVPEYRGPY 302 >UniRef50_C6I8L1 CRISPR-associated protein n=1 Tax=Bacteroides sp. 3_2_5 RepID=C6I8L1_9BACE Length = 756 Score = 140 bits (353), Expect = 7e-32, Method: Composition-based stats. 
Identities = 45/282 (15%), Positives = 89/282 (31%), Gaps = 56/282 (19%) Query: 38 TGIRTHIPVGSVACI-MLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARS 96 G + P ++ I ++ G +S A+ + + G + P Sbjct: 446 QGKIINKPSPALKHITVVGKGISLSSNAITYCMNHKIPIDFFDGRGKQYSTVLNPVFLDG 505 Query: 97 DKLLYQAKLALDEDLRL-----------------------------KVVRKMFE--LRFG 125 Q +L L++ ++L K+ K E L+ Sbjct: 506 TLWNKQVELPLEQKIKLATQIIIGKLKNQLNLIKYYHKYHKDILGGKLSEKYVEVVLKID 565 Query: 126 EPAPARRSV--------EQLRGIEGSRVRATY---ALLAKQYGVTWNGRRYDPKDWEKGD 174 + ++ +L IE A + +L G+ + R + D Sbjct: 566 KLIEKAKNYSQRNEKYTAELMAIESQAAIAYWSYIRVLTADDGIDFIRREHQ----GATD 621 Query: 175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIADIIKFDTVVPKA 232 +N ++ + LY ILAA P+IG +H + V+D+ ++ + V Sbjct: 622 LLNSLLNYGYAILYARVWKNILAAKLNPSIGVLHAKQDGKPTLVFDVVELFRAQMVDRVV 681 Query: 233 FEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 + ++V L D + + LI I + L Sbjct: 682 ISLI-------QKKVSLKMHDGLLNESSKRVLIRYILERLNR 716 >UniRef50_A0LWB2 CRISPR-associated protein Cas1 n=1 Tax=Acidothermus cellulolyticus 11B RepID=A0LWB2_ACIC1 Length = 295 Score = 138 bits (349), Expect = 2e-31, Method: Composition-based stats. 
Identities = 49/256 (19%), Positives = 88/256 (34%), Gaps = 33/256 (12%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 G++ GA ++ D+ +P+ VA ++ P + + + AA G +V Sbjct: 14 SGEVHAAQGALLVGDE-----RVPLVDVAMMLTGPYVSLHGSVIDRAAAFGVGVVHCDWR 68 Query: 82 GVRVYA--SGQPGGARSDKLLYQAKL--------ALD--EDLRLKVVRKMFELRFGEPAP 129 GV V A + + QA+L ++ + + LR A Sbjct: 69 GVPVAATLPWSTHNRVAARHRAQAELSLPRQKNAWMNIVKTKIRNQAAVLRALRRDGVAQ 128 Query: 130 ARRSVEQLR-----GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 R Q+R EG+ R +A L + + + D +N + Sbjct: 129 LERLAAQVRSGDASNAEGAAARVYWARL-------FQDKHFRRVPR-ARDVVNGLLDYGY 180 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGE 242 + L G A++ AG AP++G H P + V D+ + + V EI Sbjct: 181 AILRGCCLRAVVGAGLAPSLGLWHRRHDNPFTLVDDLIEPFRPA-VDKTVIEIVTAGASG 239 Query: 243 PDREVRLACRDIFRSS 258 DR + + Sbjct: 240 LDRPTKRLLVAVLDHQ 255 >UniRef50_Q7MRD4 Putative uncharacterized protein n=1 Tax=Wolinella succinogenes RepID=Q7MRD4_WOLSU Length = 314 Score = 138 bits (348), Expect = 2e-31, Method: Composition-based stats. 
Identities = 57/290 (19%), Positives = 101/290 (34%), Gaps = 37/290 (12%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLE-PGTRVSHAAVRLAAQVGTL 74 +++ + V D V+ + G RT +P+ + +++E P +S A + AQ GT Sbjct: 6 TVVIATSCHLRVRDRQLVIESREGERTQLPLADIGVVIVENPQVTLSAALLSALAQEGTA 65 Query: 75 LVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE---------LRFG 125 L+ + + G + + E + + +K+ LR Sbjct: 66 LMSCDSSHLPDGVFAPFGTHSRHTRAARVQAVWSEPFKKRCWQKIVRAKVSSQAEVLRRV 125 Query: 126 EPAPARRSVEQLRG---------IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 A R +E L G +E + + L + W+G D Sbjct: 126 GAEDAARRLENLVGKVTSGDTTGVEAQAAQVYWRSLFVNFK-RWDG---------GLDAR 175 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFE 234 N ++ + L G AI AG PA G H G+ SF D+ + + V + Sbjct: 176 NGGLNYGYAVLRGALGRAIAGAGLIPAFGLHHAGELNSFNLADDLIEPFRP-FVDLLVWG 234 Query: 235 IA--RRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQPPAP 282 + R RE R+ I S + E +L+A +I + Sbjct: 235 LFLERTGDDPLTREERMRIAQILGESCIMT---ERYETLLSATQIIARSL 281 >UniRef50_A1WH94 CRISPR-associated protein, Cas1 family n=5 Tax=Proteobacteria RepID=A1WH94_VEREI Length = 309 Score = 138 bits (348), Expect = 2e-31, Method: Composition-based stats. 
Identities = 43/262 (16%), Positives = 89/262 (33%), Gaps = 37/262 (14%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTH---IPVGSVACIML-EPGTRVSHAAVRLAAQVGTLLV 76 + V G V+ D G R +P+ +A ++ G ++ + A+ G V Sbjct: 11 DRRHLFVNRGFMVIKDTEGERKELGQVPLDDIAAVIANAHGLTYTNNLLVALAERGAPFV 70 Query: 77 WVGEAGVRVYASGQP--GGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP----- 129 G V ++ ++ Q L RL + + + + A Sbjct: 71 LCGPNHNAVGMLLPLDGHHVQAKRIEAQIAAGLPMHKRLWAA--VVKSKLEQQAAALEAV 128 Query: 130 --ARRSVEQLR---------GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQ 178 ++ L IEG R + LL +G + + + GD +N Sbjct: 129 AAPTAPLQALVAKVRSGDPENIEGQGARRYWGLL---FGAEF-------RRDQSGDGLNA 178 Query: 179 CISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKAFEIA 236 ++ + + A++AAG P+IG H+ + V D+ + + V K +++ Sbjct: 179 LLNYGYTIVRSACARAVVAAGLHPSIGLHHSNDANAMRLVDDLMEPFRP-IVDLKVWQLH 237 Query: 237 RRNPGEPDREVRLACRDIFRSS 258 + + + A + Sbjct: 238 KAGESHVTPDTKRALVRVLYDD 259 >UniRef50_B5IAF4 CRISPR-associated protein Cas1 n=3 Tax=Euryarchaeota RepID=B5IAF4_9EURY Length = 404 Score = 138 bits (348), Expect = 2e-31, Method: Composition-based stats. 
Identities = 43/299 (14%), Positives = 91/299 (30%), Gaps = 49/299 (16%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSV--ACIMLEPGT-RVSHAAVRLAAQVGTLLVWVGEA 81 I+V V+ +K + +++E +S A+R + + + Sbjct: 13 INVDKRKLVIREKGKQVHEFYPHQINYDSLIIEGYYGNISFEAIRWLMKHNITVSVLNWN 72 Query: 82 GVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM---------------------- 119 G + + Q ++ ++E RLK+ K+ Sbjct: 73 GNLLSVFLPKEPINGKLKIRQYEIYINEKERLKIAEKILEEKIRKSENMLYELSEYYPEI 132 Query: 120 ----FELRFGEPAPARRSVEQ----------LRGIEGSRVRATYALLAKQYG-----VTW 160 + R + +R +E L EG + + L+K + + Sbjct: 133 EHIKVKKRIEKEEKLKRDMELKEENKPKLSYLLMYEGRVAQIYWKELSKIFNKLYPEFNF 192 Query: 161 NGRRYDPKDWE--KGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFV 216 R W D IN ++ + + L + I A G P+IGF+H V Sbjct: 193 TSRSTKSYSWNMNASDEINALLNYSYALLESMIRKHINAVGLDPSIGFLHELASSKTPLV 252 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAG 275 YD+ ++ + ++ + + I +T L+ + + Sbjct: 253 YDLQELFR-WVSDLSVIQLLEDKKLKKSSFIVTENYHIRLKPQTSKLLVEKFKLNMNKK 310 >UniRef50_A3ZPG0 Putative uncharacterized protein n=1 Tax=Blastopirellula marina DSM 3645 RepID=A3ZPG0_9PLAN Length = 331 Score = 137 bits (346), Expect = 4e-31, Method: Composition-based stats. 
Identities = 40/304 (13%), Positives = 95/304 (31%), Gaps = 40/304 (13%) Query: 9 IPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRL 67 + + R I + + + + + IP + +++ PGT +HAA+R Sbjct: 34 VLMIKRTVEISQEPFHLGLRHKQLTFKREGAVVHTIPCEEIGVVVIDHPGTTYTHAALRQ 93 Query: 68 AAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF--- 124 A +V G + ++ + L ++ R++ + Sbjct: 94 LAASDAAVVICGANHLPAAILLPLADHSEVVWRLVEQIEAPKPLCKRLWRQIVRAKIDAQ 153 Query: 125 ----GEPAPARRSVEQLR---------GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWE 171 + P R+ ++ L +E + + + G + Sbjct: 154 ASVLPDDCPGRQKLKSLIPLVKPGDAANVEARAAKTYWQFWYPEAGF---------RRDA 204 Query: 172 KGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVV 229 D +N ++ + AI++AG PA+G H + F D+ + + V Sbjct: 205 DSDGVNALLNYGYAIARAAIARAIVSAGLTPALGLHHRNRSNPFCLADDLIEPFRP-MVD 263 Query: 230 PKAFEIARRNPGEPDREVRLACRDIFRSSKTLA-----------KLIPLIEDVLAAGEIQ 278 + E+ R+ +E + + +L +++ + L + Sbjct: 264 ERVRELHRQGYDSLTQECKGELLKLLARQTSLDGEKGPFMVQLHRMLGSLVRCLRKKQTS 323 Query: 279 PPAP 282 P Sbjct: 324 LEIP 327 >UniRef50_D2R8Z2 CRISPR-associated protein Cas1 n=1 Tax=Pirellula staleyi DSM 6068 RepID=D2R8Z2_9PLAN Length = 942 Score = 137 bits (345), Expect = 5e-31, Method: Composition-based stats. 
Identities = 46/256 (17%), Positives = 88/256 (34%), Gaps = 34/256 (13%) Query: 24 QIDVIDGAFVLIDKTGI-RTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 +I V + IP+ ++ +++ +SH + Q L+ + E+ Sbjct: 624 EILVHGDVLQFKHQDQRGIESIPLHTLREVVVLGAVSLSHRVLSAIQQNAISLLLLDESA 683 Query: 83 VRV-YASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE-------------------- 121 R Y Q S + Q L + + L + R++ Sbjct: 684 NRTAYIDCQNAEPDSAGIEAQVDLIRNPESSLAIARQLISAKLHNYATLADAYPPKSHSG 743 Query: 122 ------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDT 175 +R + A S+ +L GIEGS Y + + ++ D Sbjct: 744 NAHRSLMRLAKDAQQASSLPELLGIEGSGAALWYGEIGMRLSPGFH--WERRVAPNAHDP 801 Query: 176 INQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDTVVPKAF 233 +N ++ A + LY +T+ AI A +GF+H + + D+ + + Sbjct: 802 VNILLNLAQTVLYRMTQHAIAQAKLVDTLGFLHQPRAGHAALASDMQEPFRHLM-DRAVL 860 Query: 234 EIARR-NPGEPDREVR 248 E RR P E + + R Sbjct: 861 ETVRRIRPEEFEPDER 876 >UniRef50_Q7MTH7 CRISPR-associated protein Cas1 n=3 Tax=Porphyromonas gingivalis RepID=Q7MTH7_PORGI Length = 1031 Score = 135 bits (341), Expect = 1e-30, Method: Composition-based stats. 
Identities = 34/306 (11%), Positives = 79/306 (25%), Gaps = 61/306 (19%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACI-MLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 I + ++ P + I ++ G +S + + +++ Sbjct: 707 AFIGISRNHVLVRKYGKTICKQPAAQIEQISIISDGVSLSSNVTKYCRKKNIRVIFYNAT 766 Query: 82 GVRVYASGQPGGARSDKLLYQAKL----------------ALDEDLRLKVVRKMFELRFG 125 G + + Q +L ++ L+ K + Sbjct: 767 GQAYASLNGMNTILPSVMEAQMRLSEEKKREFILTLIKNKVRNQGKLLRYYHKYYRHDKE 826 Query: 126 EPAPARRSVEQLRGIEG---------------------SRVRATYALLA---KQYGVTWN 161 P ++ +L+ +EG + + A + G + Sbjct: 827 LKEPLSNAIAELKQLEGIPIAEGSSLADFRQHAMLHEARCAQVYWRAFALLVHRSGHEFE 886 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDI 219 GR + +NQ ++ + L I+ P IG +H + +D+ Sbjct: 887 GREHK----GAEGLVNQMLNYGYAILRSYVMKTIVLWQLNPNIGILHSTQDNKPALCFDL 942 Query: 220 ADIIKFDTVVPKAFEIARRNP-------GEPDREVRLACRDIFRS-------SKTLAKLI 265 + + V + + G D R ++ KL Sbjct: 943 MEQYRAFVVDRSILALLAKGEDVGQNSKGLLDMPTRSRIISKINERWFATEYYRSGEKLF 1002 Query: 266 PLIEDV 271 I + Sbjct: 1003 SDIMKL 1008 >UniRef50_B1GZM4 CRISPR-associated protein Cas1 n=1 Tax=uncultured Termite group 1 bacterium phylotype Rs-D17 RepID=B1GZM4_UNCTG Length = 298 Score = 135 bits (340), Expect = 2e-30, Method: Composition-based stats. 
Identities = 32/238 (13%), Positives = 79/238 (33%), Gaps = 10/238 (4%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVGTL 74 ++ F + + V + + IP+ ++ I+LE ++ A + +Q + Sbjct: 5 TVFFSKPCSLSVKNSQLYCRFQDSTVHDIPIEDISVIVLESNRINLTSALISECSQSNIV 64 Query: 75 LVWVGEAGVRV--YASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARR 132 + + + Y + Q K R+ ++ + Sbjct: 65 IFSCDSSHIPCGIYVPFNQHSRFTQTANSQVKWDTAFKNRIWQKIVKQKICNQAEIIKKY 124 Query: 133 SVEQLRGIEGSRVRAT----YALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLY 188 S G++G+ R A + W + + + D N ++ + + Sbjct: 125 SFANYVGLKGTCDRVQSGDKTNCEAFAAKIYWESIFENFQRNKNSDIRNSALNYGYAIVR 184 Query: 189 GVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEPD 244 GV ++ ++G+ G H+ +F DI + + V A +I + + + Sbjct: 185 GVVARSLASSGFITCFGVHHSNDLNAFNLADDIIEPFRP-FVDDIAIDIFKNSETSKE 241 >UniRef50_Q9YCL8 Putative CRISPR-associated protein Cas1 n=1 Tax=Aeropyrum pernix RepID=Q9YCL8_AERPE Length = 327 Score = 135 bits (339), Expect = 2e-30, Method: Composition-based stats. Identities = 56/310 (18%), Positives = 100/310 (32%), Gaps = 52/310 (16%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 +I V GA V+ K G + + I+ +S AAVR AA++G LV++ G Sbjct: 10 SRIRVARGALVVETKAGKKVVVESSVERVIISSSRVSISSAAVRAAAKMGIDLVFLDWDG 69 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE--------------------- 121 V A P + + +E LR + ++ Sbjct: 70 SPV-ARLYPPIINKTVATRIGQFSANERLRRLIAAELVSAKIYNQGQTLKYIARQRADER 128 Query: 122 ---------------LRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNGRR 164 LR + ++L IE R + +A+ + ++GR Sbjct: 129 LREAGYEVELLSGEPLRIADEDGPGF-RDKLLSIEARASRRYWQCIAEILPGRLGFSGR- 186 Query: 165 YDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADI 222 D D N ++ LY + E ++L G P +G H+ K S D + Sbjct: 187 ----DRGALDPFNAALNYGYGMLYSIVEKSLLLVGLDPYLGVFHSEKSGKPSLTLDAIEP 242 Query: 223 IKFDTVVP-KAFEI----ARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEI 277 + V A + + G D + R SS ++ + + + + Sbjct: 243 FRAPIVDRILALKAGRMYLKLEAGRLDYKSRKEVAKAVASSLSMKAAVRGLGRRIRLEDA 302 Query: 278 QPPAPPEDAQ 
287 A+ Sbjct: 303 IMVQARWLAE 312 >UniRef50_Q1WVJ8 CRISPR-associated protein n=1 Tax=Lactobacillus salivarius UCC118 RepID=Q1WVJ8_LACS1 Length = 301 Score = 134 bits (338), Expect = 4e-30, Method: Composition-based stats. Identities = 46/286 (16%), Positives = 104/286 (36%), Gaps = 32/286 (11%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTL 74 ++I Q+ ++ + A ++ +K G IP+ + +++ ++ A V A+ G Sbjct: 5 NVIITQHSKLSYSNNAMIVQNKDG-INQIPLVDMDILLISTTQAVITSALVSKLAESGIK 63 Query: 75 LVWVGEAGVRV-YASGQPGGARSDKLLYQAKLALDEDL--------RLKVVRKMFELRFG 125 +++ V RS Q D R K++ ++ L+ Sbjct: 64 VIFTDNKNEPVTETVNYYPNNRSLDTYLQQYEWNDHVKEILWTKIVRSKIINQIKVLKNY 123 Query: 126 EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATS 185 + +E L +E + + A++A++Y G +Y KD+ +N ++ S Sbjct: 124 QIDCQDLKIE-LDKLEINDMTNREAVVARKYFEKLFGNKYSRKDFT---PMNAALNYGYS 179 Query: 186 CLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEP 243 L I++AGY +G H + F + D+ + + A +N E Sbjct: 180 ILLSAVNKEIVSAGYVTYLGIHHQSQENMFNFGSDLMESFRPVVDYWLA----DKNFLEF 235 Query: 244 DREVRLACRDIF-----------RSSKTLAKLIPLIEDVLAAGEIQ 278 +++ D+ + K + + + L ++ Sbjct: 236 TPDIKYGLVDLLNLEIKYNKKQMLLKNAITKYVRDVIEYLNQETVE 281 >UniRef50_A1ZHZ5 Crispr-associated protein Cas1 n=1 Tax=Microscilla marina ATCC 23134 RepID=A1ZHZ5_9SPHI Length = 344 Score = 134 bits (337), Expect = 4e-30, Method: Composition-based stats. 
Identities = 57/296 (19%), Positives = 94/296 (31%), Gaps = 56/296 (18%) Query: 23 GQIDVIDGAFVLIDKTGIRTH------IPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 + V + F + K I +V I L GT VS A+ LA +V Sbjct: 13 AYLQVKNQMFCIRTKNAKTREYHAHAPIAPHTVHRIFLHRGTAVSVDAMHLALINNIDIV 72 Query: 77 WVGEAGVRVYASGQPGGARSDKLL-------------YQAKLALDEDLRLKVV------- 116 + + + K+ K L L ++ Sbjct: 73 LMQHDDHPIGRVWHTKLGSTTKIRKKQLEASLNEVGTAWVKQWLSTKLGNQIAMIEQLKK 132 Query: 117 ----RKMFELRFGEPAPARRS-------------VEQLRGIEGSRVRATYALLAKQYG-- 157 ++ + + A ++ + LRG+EG+ R + L+ Sbjct: 133 HRESKRTYLHEKADKIRALQTSIEALEADHTEAIADTLRGLEGTAGRLYFETLSYVLAKE 192 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSF 215 + GR P D N ++ LY + E A++ AG P +GF+H S Sbjct: 193 YQFAGRSKRP----AHDAFNAFLNYGYGILYSMVEHALVIAGIDPFVGFMHRDGYNQRSM 248 Query: 216 VYDIADIIKFDT--VVPKAFEIARRNPGEPD---REVRLACRDIFRSSKTLAKLIP 266 VYD + + VV K F + N D VRL ++ L K + Sbjct: 249 VYDFIEPYRGHVEQVVVKLFTAKKVNQSHTDAIKDGVRLNEMGKPLLTEALGKYLR 304 >UniRef50_Q8TVS6 Uncharacterized conserved protein n=1 Tax=Methanopyrus kandleri RepID=Q8TVS6_METKA Length = 331 Score = 134 bits (337), Expect = 4e-30, Method: Composition-based stats. 
Identities = 55/329 (16%), Positives = 102/329 (31%), Gaps = 56/329 (17%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 ++IFL+ G+I+ + AF + ++ I++ G +++ AVRLA + + Sbjct: 7 TVIFLRRGKIERREDAFRI-----GKSKYSAVRTTGIIIAGGAQITTQAVRLALRNEVPI 61 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF----------- 124 V++G + + L Q ++A RL R + Sbjct: 62 VYLGGNRILGVTVPFSE-RYATLRLKQYEIASQPSARLAFARPLIASSILARAAVLEFLA 120 Query: 125 ------------------GEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYD 166 E A S + LRG EG + LA+ W Sbjct: 121 NETGITGLEDAADEVRSEAERALNAGSTDALRGYEGRAACRYFRALAEVLP-DWAF-SGR 178 Query: 167 PKDWEKGDTINQCISAATS-CLYGVTEAAILAAGYAPAIGFVH--TGKPLSFVYDIADII 223 D N IS + L V + +AAG P +GF+H G+ + D+ + Sbjct: 179 RTRRPPRDPFNAAISFGYAGVLLPVLLSRTVAAGLEPFLGFLHGPRGRRPGLILDLMEEW 238 Query: 224 KFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSS-------------KTLAKLIPLIED 270 + V + R + + L+ ++ Sbjct: 239 RALAVDVPVLRRFLDGSLSREM-FRWKGDAVLLRDLDAVSAPVLTVLSRVRGGLLEAVDR 297 Query: 271 VLAA--GEIQPPAPPEDAQPVAIPLPVSL 297 + + +PPE + + V Sbjct: 298 RIREVRDGVSRQSPPEPLEFDPEDVGVVW 326 >UniRef50_B1L400 CRISPR-associated protein Cas1 n=2 Tax=Archaea RepID=B1L400_KORCO Length = 341 Score = 133 bits (336), Expect = 6e-30, Method: Composition-based stats. 
Identities = 43/258 (16%), Positives = 85/258 (32%), Gaps = 48/258 (18%) Query: 25 IDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + G +++ K G + IP+ SV +++ +S ++ AQ GT L+ G Sbjct: 18 LRKRRGRILILSK-GEKKEIPMKSVKEVVIIGKAALSSELLKALAQSGTDLLIATPTGRP 76 Query: 85 VYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMF----------------ELRFGEPA 128 V + + Y+ +L++ +++ R + R E Sbjct: 77 VARLIPAKAGGTARNRYEQYKSLEDRRGIEIARAVIVGKIRNQASNLSYYSKARRMDEEL 136 Query: 129 PARR------------------------SVEQLRGIEGSRVRATYALLAKQYGVTWNGRR 164 + + +++ E + +A W R Sbjct: 137 SSELYDAAQQLKREMEELKNEEFPDIDEARKRIMARESKCANIYWEKIASIM-EEWKFRG 195 Query: 165 Y-DPKDWEKG--DTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDI 219 D E D +N C++ + L +L G P +G++H +P S VYD+ Sbjct: 196 REKRTDLEGNVIDPVNLCLNVCYNLLSAQIWKNVLRFGLDPFLGYLHVERPGRISLVYDL 255 Query: 220 ADIIKFDTVVPKAFEIAR 237 + + V F R Sbjct: 256 MEPFRP-MVDRFVFSYLR 272 >UniRef50_Q5X8T5 Putative uncharacterized protein n=1 Tax=Legionella pneumophila str. Paris RepID=Q5X8T5_LEGPA Length = 330 Score = 133 bits (336), Expect = 6e-30, Method: Composition-based stats. 
Identities = 39/261 (14%), Positives = 75/261 (28%), Gaps = 44/261 (16%) Query: 26 DVIDGAFVLIDKTGIRTHIPVGSVACI-MLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVR 84 + ++ +K + + + I +++ G ++ + A G + + +G Sbjct: 14 KLQGERILVFNKDQMVFEGTLNRLKTISIVKQGVTLTSNLLVACAMRGIQIFILDYSGKA 73 Query: 85 VYASGQPGGARSDKLLYQAKL-----ALDEDLR-------------LKVVRKMFELRFGE 126 V + + Q K A + R + + Sbjct: 74 VCSLQGGLHRTAKVRESQFKYIYTKEASNMAARFILAKIKKQSAVLKYFAKAKNRPENEK 133 Query: 127 PAPARRSVE---------------------QLRGIEGSRVRATYALLAKQYGVTWNGRRY 165 S + L G EGS R + + + V + Sbjct: 134 IILHAASQKMAGIVANITETNWIKLQDWRFTLLGFEGSAARIYWETVKEVNLVCSDFEGR 193 Query: 166 DPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADII 223 P D N ++ A S L AI AG G +HT KP S V D+ + Sbjct: 194 RP--RVNKDITNSMLNYAYSILSSWVWQAITNAGLELYAGILHTSKPGKPSLVLDLMEEF 251 Query: 224 KFDTVVPKAFEIARRNPGEPD 244 + F++ R + + Sbjct: 252 RPWCADRIIFKLHARAQKQNE 272 >UniRef50_C9RJP2 CRISPR-associated protein Cas1 n=21 Tax=cellular organisms RepID=C9RJP2_FIBSS Length = 298 Score = 132 bits (332), Expect = 2e-29, Method: Composition-based stats. 
Identities = 34/279 (12%), Positives = 82/279 (29%), Gaps = 37/279 (13%) Query: 14 RVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRLAAQVG 72 + ++ F + + D V+ + + +A I+L P V++A + + Sbjct: 3 KRTLYFGNQAYLSLKDNQLVIKKRNDEIVTAAIEDIAYIVLDSPQITVTNALLGALLENN 62 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF-------- 124 ++ + + G +QA++ L+ ++ ++ + + Sbjct: 63 CAIINCDKTHLPSGLLLPLSGNTLQSERFQAQIDASLPLKKQLWQQTVQQKILNQAAVLH 122 Query: 125 ----GEPAPARRSVEQLR-----GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDT 175 E +R E + + R P Sbjct: 123 GSHDAEIGNMTAWANSVRSGDVDNREAVAAAYYWKEMFPDIPDFVRDRNGVPP------- 175 Query: 176 INQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKAF 233 N ++ + L GV A++++G P +G H + + DI + + V Sbjct: 176 -NNMLNYGYAILRGVVARALVSSGLLPTLGIHHHNRYNAYCLADDIMEPYRP-IVDKLIL 233 Query: 234 EIARRNPGEPD--------REVRLACRDIFRSSKTLAKL 264 E+ P + +R+ D + Sbjct: 234 EVINEIEEYPTDLSTEIKAKLLRIPVLDCVIDGNRSPLM 272 >UniRef50_A2SQK9 Uncharacterized protein predicted to be involved in DNA repair-like protein n=1 Tax=Methanocorpusculum labreanum Z RepID=A2SQK9_METLZ Length = 310 Score = 131 bits (331), Expect = 2e-29, Method: Composition-based stats. 
Identities = 43/277 (15%), Positives = 88/277 (31%), Gaps = 34/277 (12%) Query: 33 VLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRV---YASG 89 +++ + G T P+ + +++ G + + + A G + + G V Y G Sbjct: 21 LIVRQKGSTTQYPLDDMRHLLIAGGHSLHTSVLERLADRGIAVSFFTAHGKPVGGIYGKG 80 Query: 90 QPGGARSDKLL-----YQAKLALDEDLRLKVVRKMFELR------FGEPAPARRSVEQL- 137 P A + + A + D RL+ + ++ E GE + E+L Sbjct: 81 APSLASQQRDIPIHKFAMASIRSSLDERLRYINELAEFDPEGLYFKGEFDILTAAREELE 140 Query: 138 -------RGIEGSRVRA-TYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 G S + Y ++ ++ R D +N +S + LY Sbjct: 141 YLITLPEIGRAFSLTKTMYYEIIGRKLPKVLGYR--RRCQPPFMDPVNVMMSHGYAVLYA 198 Query: 190 VTEAAILAAGYAPAIGFVHT------GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGE- 242 A AG + G ++ G V D+ + V ++A + Sbjct: 199 NFALACTGAGLDLSRGALYGEIVSAPGGRGGCVLDLMEPATVSMVDRVIIQMAAEGRLDG 258 Query: 243 -PDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQ 278 + R + + + +L I L + Sbjct: 259 AYEVTTRCLLSNE-LKEEFMKRLHGSINITLIEENVN 294 >UniRef50_A2BKJ8 Universally conserved protein n=1 Tax=Hyperthermus butylicus DSM 5456 RepID=A2BKJ8_HYPBU Length = 331 Score = 131 bits (331), Expect = 2e-29, Method: Composition-based stats. 
Identities = 47/279 (16%), Positives = 83/279 (29%), Gaps = 49/279 (17%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGV 83 +I V G +G + + + ++ VS A+R A++G LV +G+ G Sbjct: 13 RIYVRRGVVYAEAPSGEKAVVTADTELVVLATGSVSVSGRALRRLAELGVRLVVLGQRGQ 72 Query: 84 RVYASGQPG--GARSDKLLYQAKLALDEDLRLKVVRKMFEL------------------- 122 V + + Q ++ + ++ Sbjct: 73 VVAEHRPVDRVNRTIEARMEQYRVKATGEALYYAAEMVYAKIVNQAKLLRYLAKSRREPW 132 Query: 123 -------------RFGEPAPAR--RSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDP 167 R + + E +R IE R + +A+ + GR Sbjct: 133 LRDAGYRVEGHADRLRQIIENEEPTTPEVIRSIEAQAARDYWDAIAQIAPTPFPGR---- 188 Query: 168 KDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKF 225 D +N +S + LY + A+ AG P GF+H + S YD AD K Sbjct: 189 -QPRGEDHLNMALSYGYAILYSIAHDALTVAGLDPYAGFLHADRSGRPSLTYDYADTYKP 247 Query: 226 DTVVPKAFEIARR------NPGEPDREVRLACRDIFRSS 258 V R+ G R + + Sbjct: 248 IAVDKPLLTAPRKTDCLDTYMGALTYNARRCIATLVLEN 286 >UniRef50_B2KB47 CRISPR-associated protein Cas1 n=2 Tax=Elusimicrobia RepID=B2KB47_ELUMP Length = 298 Score = 131 bits (331), Expect = 2e-29, Method: Composition-based stats. Identities = 25/225 (11%), Positives = 69/225 (30%), Gaps = 31/225 (13%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVGTLLVWVGEAG 82 + V + F + + H + I+L +S+ ++ + +++ + Sbjct: 13 HLCVKNNNFSAVKDREEKLHCLFDDINSIILYGNNITISNTCIQKCLEHKVPVIFCDKTY 72 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRG--- 139 Q ++ + + +++ + A + + L+ Sbjct: 73 NPAGMLLSSFTTNIYGRRLQLQINASKPQIKQAWQQIITSKLNNQAEVLKRFDTLKAAET 132 Query: 140 ---------------IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 EG + + L + ++ + D IN ++ Sbjct: 133 IFNMAREVRSGDATFKEGVGAKVYFENLFNDFH----------RNTDDKDIINSALNYGY 182 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDT 227 + + A+++AG PAIG H+ F D+ + ++ Sbjct: 183 AIVRSSIARAVVSAGLNPAIGIFHSKNHNPFCLIDDLIEPLRPLI 227 >UniRef50_B9M9X7 Putative uncharacterized protein n=1 Tax=Diaphorobacter sp. 
TPSY RepID=B9M9X7_DIAST Length = 302 Score = 131 bits (330), Expect = 3e-29, Method: Composition-based stats. Identities = 38/264 (14%), Positives = 85/264 (32%), Gaps = 25/264 (9%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRLAAQVGTL 74 S++ + ++ A V+ + +P +A I+L ++H + + G Sbjct: 5 SIVINRPAKLKREHFALVVEQEQ--SARVPFEDIAVIVLNHREITLTHPVLSACGEYGIG 62 Query: 75 LVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSV 134 L G+ + + +L LD+ + + +++ G A Sbjct: 63 LYSTGDNHQPNGVFMPFLQHSRATRMQRLQLDLDKPSAKRAWAHIVQVKIGNQARCM--- 119 Query: 135 EQLRGIEGSRVRATYA-------------LLAKQYGVTWNGRRYDPKDWEKGDTINQCIS 181 +L G G+ A+YA + Y GR + N + Sbjct: 120 -ELLGTVGTDRLASYARRVRSGDGGNLEAQASAYYFPQVFGRSFHRSQTGWS---NAALD 175 Query: 182 AATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRN 239 + + G A++A G P++G H + +F D+ + + + A Sbjct: 176 YGYAVMRGACARALVAHGMLPSLGLFHRSEQNAFNLADDLIEPYRPVVDLHVAQHRPADE 235 Query: 240 PGEPDREVRLACRDIFRSSKTLAK 263 E ++A + + + Sbjct: 236 DAELQPSDKVALVGLLNVDVAMPR 259 >UniRef50_C7V674 Predicted protein n=2 Tax=Enterococcus faecalis RepID=C7V674_ENTFA Length = 304 Score = 131 bits (329), Expect = 3e-29, Method: Composition-based stats. 
Identities = 40/294 (13%), Positives = 92/294 (31%), Gaps = 46/294 (15%) Query: 17 MIFLQYG-QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVGTL 74 +F++ G ++ V +I + G IP+ + ++LE T ++ + AQ Sbjct: 5 TVFVKNGEKLKVKLDNLEVIKE-GNTYVIPLTDIESVILEGDQTVITTRLLAKFAQHHID 63 Query: 75 LVWVGEAGVRV--YASGQPGGARSDKLL------------YQAKLALDE-DLRLKVVRKM 119 V + V + + + + ++ + ++ V + + Sbjct: 64 TVICDNTFMPVGVFLGIGQYHRSAKRAIWQSNWTEEHKQVAWCEIVTQKIQNQIAVAKYL 123 Query: 120 FELRFGEPAPARRSVEQLRG----IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDT 175 + S L G EG + + L YGV + E+ Sbjct: 124 GTDSERVEVLEKLSEGILPGDTTNREGHVAKVYFHSL---YGVGFT--------REEECL 172 Query: 176 INQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAF 233 N C++ + + ++A G P +G H + SF D+ + + Sbjct: 173 PNACMNYGYAVIRAQMARCVVALGLLPMLGIFHKNEYNSFNLVDDLMEPFRPLMDWYIHQ 232 Query: 234 EIARRNPGEPDREVRLACRDIFRSS-----------KTLAKLIPLIEDVLAAGE 276 I ++N RL + + +++ + + G+ Sbjct: 233 TILKKNEKYLTYHSRLTLVEFLHQKIKVKNKKIYMNQAMSEYVAAFVKAMETGD 286 >UniRef50_Q13CC1 CRISPR-associated protein, Cas1 family n=2 Tax=Rhodopseudomonas palustris RepID=Q13CC1_RHOPS Length = 299 Score = 130 bits (328), Expect = 5e-29, Method: Composition-based stats. 
Identities = 45/279 (16%), Positives = 87/279 (31%), Gaps = 47/279 (16%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTR 59 M W L+ Q ++ + D + G + + +A I++ P Sbjct: 1 MAWRGLHLT-----------QAARLSLADSQVCVKQDAGE-VRLALEDIAWIVIDTPQAT 48 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVYAS--GQPGGARSDKLLYQAKLALDEDLRLKVV- 116 ++ A + + G +LV+ E + + Q RL Sbjct: 49 LTSALMSACMEAGIVLVFTDERHTPSGMALPFHRHHRQGGIARLQMDAKDGVKKRLWQAI 108 Query: 117 --RKMFELRFGEPAPARRSVEQLR------------GIEGSRVRATYALLAKQYGVTWNG 162 RK+ R + E LR +E R + L + + Sbjct: 109 IRRKILNQAGSLAVLDRNNAETLREIARHVEPGDPENVEARAARFYWGRLFEDF------ 162 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIA 220 + GD N+ ++ + + A++A+G+ PA G H G +F D+ Sbjct: 163 -----VRDDDGDLRNKMLNYGYAVVRAGVARALVASGFLPAFGLKHDGAANAFNLADDLV 217 Query: 221 DIIKFDTVVPKAFEIAR---RNPGEPDREVRLACRDIFR 256 + + V A++ G+ E R A + Sbjct: 218 EPFRP-FVDVLAWKTLGDRVDRKGDLTLEDRRAMAGVLL 255 >UniRef50_B2UP48 CRISPR-associated protein Cas1 n=1 Tax=Akkermansia muciniphila ATCC BAA-835 RepID=B2UP48_AKKM8 Length = 311 Score = 130 bits (327), Expect = 5e-29, Method: Composition-based stats. 
Identities = 38/255 (14%), Positives = 66/255 (25%), Gaps = 34/255 (13%) Query: 23 GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLLVWV--- 78 + G D IP+ V ++L ++ + A+ V Sbjct: 13 CHLSCDKGQLRCADGENSPRTIPLEDVGAVVLSSFKATLTSNLLIELARKRIGFVLCESY 72 Query: 79 ------------GEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE 126 + G+ + + P R+ L E Sbjct: 73 RPAVLLLPADRSTDTGLLRHLADMPARLRNRLWQKTLDAKCGNQTALAQAWNPHHPAIAE 132 Query: 127 PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSC 186 + + R E R +++ A TW + E+G N + A + Sbjct: 133 LKRMAVTEKTAR--EAECARLFWSVFAD----TWANSDFRRGRHEEG--FNNLFNYAYAI 184 Query: 187 LYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDT---VVPKAFEIARRNP- 240 L + A G P G H + + YD+ + + V R Sbjct: 185 LLSCILQYLFALGLDPCFGIFHQSREHAAPLAYDLMEPFRPAFDANVARWIHLCLREGKT 244 Query: 241 ----GEPDREVRLAC 251 GE RE R Sbjct: 245 EERAGEITREFRQHI 259 >UniRef50_A8TI31 CRISPR-associated protein Cas1 n=1 Tax=Methanococcus voltae A3 RepID=A8TI31_METVO Length = 235 Score = 130 bits (326), Expect = 7e-29, Method: Composition-based stats. Identities = 30/203 (14%), Positives = 62/203 (30%), Gaps = 27/203 (13%) Query: 98 KLLYQAKLALDEDLRLKVVRKM--------------FELRFGEP-----APARRSVEQLR 138 ++ Q++ L+ D R + ++ ++L + + + + Sbjct: 1 MIVQQSENYLNSDKRHYIAKEFVKGSILNISKNLSRYKLDYDKEKYITALQKVNRITDIM 60 Query: 139 GIEGSRVRATYALLAKQYG-VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILA 197 +E Y + + R P + E IN IS S LY + I Sbjct: 61 NVEALMRNDYYNNFDNIFKNFKYEKRSRRPPENE----INALISFGNSLLYSTVISEIFN 116 Query: 198 AGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIF 255 P++ ++H + S DIAD+ K V F + + + Sbjct: 117 THLNPSVSYLHEPYERRYSLALDIADVFKPIFVDRLIFNLVNKKIIN-ENHFEKDLNSCL 175 Query: 256 RSSKTLAKLIPLIEDVLAAGEIQ 278 + + A + + L Sbjct: 176 LNDEGRAIFLSKYNERLQKTIKH 198 >UniRef50_A8LN06 CRISPR-associated protein Cas1 n=2 Tax=Rhodobacterales RepID=A8LN06_DINSH Length = 303 Score = 129 bits (324), Expect = 1e-28, Method: Composition-based stats. 
Identities = 43/262 (16%), Positives = 84/262 (32%), Gaps = 30/262 (11%) Query: 13 DRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLE-PGTRVSHAAVRLAAQV 71 D++ I + G + + IP+ +A +++ GT + + + A Sbjct: 2 DQIVDIATDGRHLSRDRGFLKVSEGAREIGRIPLDQIAGVIVHAHGTTWTTSLLTELADR 61 Query: 72 GTLLVWVGEAGVRVYASGQP--GGARSDKLLYQ----AKLALDEDLRLKVVR-KMFELRF 124 G +V G A+ +L Q A L + + + M Sbjct: 62 GAPVVLCGANHAPRSVLMPLDGHHAQGARLRAQWQARAPLVKQAWKQTVIAKIAMQAAAL 121 Query: 125 GEPAPARRSVEQLR---------GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDT 175 V L +E R + + G + +D D Sbjct: 122 EAMGEPHAPVGMLARKVTSGDATNVEAQAARLYW---PRMMGTEF------RRDRTAPD- 171 Query: 176 INQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAF 233 +N ++ + L T A++AAG P IG H+ + +F D+ + + V Sbjct: 172 LNALLNYGYTVLRAATARAVVAAGLHPTIGLHHSNRGNAFALADDLMEPFRP-LVDCCVR 230 Query: 234 EIARRNPGEPDREVRLACRDIF 255 +A RN + D + + + Sbjct: 231 GLAARNGPQVDPAAKQSLARLI 252 >UniRef50_A6DE65 Putative uncharacterized protein n=1 Tax=Caminibacter mediatlanticus TB-2 RepID=A6DE65_9PROT Length = 267 Score = 128 bits (323), Expect = 2e-28, Method: Composition-based stats. 
Identities = 30/239 (12%), Positives = 74/239 (30%), Gaps = 41/239 (17%) Query: 65 VRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF 124 + + + +V + + Q ++ + + R++ R+ + + Sbjct: 2 IYALQKRKISIEFVDYKYNPYAMIYGIELSYPKIAIKQLEIV-NSEKRIEFAREFVKAKI 60 Query: 125 GEPAP---------------------------ARRSVEQLRGIEGSRVRATYALLAKQYG 157 +++L G EGS + ++K Sbjct: 61 VNQRNYLKIMSKYHKNLEENIIKIDKIKSKIKNANKIDELMGYEGSIANIYWNGISKIL- 119 Query: 158 VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT-GKPLSFV 216 N + D IN ++ + LY + A++ AG I F+HT K V Sbjct: 120 ---NEEDFKRITKGATDRINTALNYGYAILYNKVQKALIKAGLGINISFLHTFEKKPVLV 176 Query: 217 YDIADIIKFDTVVPKA-FEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAA 274 +D + + V F + + + + + + + +++ + + LA Sbjct: 177 FDFIEQFRCVAVDKAICFSLKKSDDIDVN-------NKGMLTREAKRRIVEEVNERLAT 228 >UniRef50_A6VLA8 CRISPR-associated protein Cas1 n=13 Tax=Proteobacteria RepID=A6VLA8_ACTSZ Length = 305 Score = 128 bits (322), Expect = 2e-28, Method: Composition-based stats. Identities = 35/318 (11%), Positives = 91/318 (28%), Gaps = 57/318 (17%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TR 59 MTW I + G++ + ++ + G +P+ +A +++E T Sbjct: 1 MTW---RSILMSK--------GGKLSLQQNQMLIQ-QEGNEFTVPLEDIAIVVVESRETV 48 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVYASGQ--PGGARSDKLLYQAKLALDEDLRLKVVR 117 ++ + G + E + + L Q + +L + +L Sbjct: 49 ITIPLLSAFGLHGVTFLTCDEQFLPCGQWLPFNQYHRQLKTLKLQLEASLPQKKQLWQAI 108 Query: 118 KMFELR----------FGEPAPARRSVEQLR------GIEGSRVRATYALLAKQYGVTWN 161 ++R F + + + +E + L Sbjct: 109 VQQKIRNQAGVLKICKFQAESDRLLKMAEKVKSGDKENLEAQSAVIYFQTL--------F 160 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DI 219 G+ + + E +N ++ + + A++ G+ P IG H + +F D Sbjct: 161 GKGFKRSEDES--AVNSALNYGYTVMRSAVARALVLYGWLPQIGLFHRSELNAFNLADDF 218 Query: 220 ADIIKFDTVVPKAFEIARRNPG--EPDREVRLACRDIFRSS-----------KTLAKLIP 266 + + V ++ + ++ + + + + Sbjct: 219 IEPFRP-LVDLLVVQLENEDKLSANLSPILKQRLIKVLNYQLLFKQEKVNALTAIERSVG 277 Query: 267 LIEDVLAAGEIQPPAPPE 284 + L+ PE Sbjct: 278 
SFQTALSQRNADLLKLPE 295 >UniRef50_C0WRP8 CRISPR-associated protein n=18 Tax=Lactobacillaceae RepID=C0WRP8_LACBU Length = 327 Score = 127 bits (319), Expect = 6e-28, Method: Composition-based stats. Identities = 45/252 (17%), Positives = 88/252 (34%), Gaps = 21/252 (8%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTL 74 S+I Q+ ++ ++ G IP+ ++ +++ ++ A + AQVGT Sbjct: 31 SVIITQHAKLSYSSHMMIVQTNDG-INQIPIDDISILLISTTRAVITTALISELAQVGTK 89 Query: 75 LVWVGEAGVRV--YASGQPGGARSDKLLYQAKLALDEDLRLK---VVRKMFE----LRFG 125 +++ A + P L Q L V KM L F Sbjct: 90 VIFTDGANQPICETVGYYPNNRSVKLLQEQVNWDEQRKEVLWTKIVASKMINQVNVLTFY 149 Query: 126 EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATS 185 + + E+L +E + A++A++Y + ++ IN + S Sbjct: 150 KKDTTEVN-EELAKLEVADPSNREAVVARKYFPLLFNNDFSRRNGSA---INAALDYGYS 205 Query: 186 CLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEP 243 L I++ GY IG H G+ F D+ + + F IA + + Sbjct: 206 ILLSSINQEIVSNGYLTYIGIHHRGEDNPFNLGSDLMEPFRPVVD----FWIASQKFDQL 261 Query: 244 DREVRLACRDIF 255 +V+ + Sbjct: 262 TPDVKYGLVQLL 273 >UniRef50_B4AQ39 Crispr-associated protein Cas1 n=5 Tax=Francisella RepID=B4AQ39_FRANO Length = 334 Score = 126 bits (317), Expect = 9e-28, Method: Composition-based stats. 
Identities = 38/259 (14%), Positives = 74/259 (28%), Gaps = 40/259 (15%) Query: 19 FLQYGQIDVIDGAFVLIDK--TGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLV 76 ++ + G V+ DK ++T + V V + + ++ + + LV Sbjct: 17 IFDGVKLSLSLGNIVIKDKETDEVKTKLSVHKVLALFIVGNMTMTSQLLETCKKNAIQLV 76 Query: 77 WVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE---------- 126 ++ + G A Q + + + + R + Sbjct: 77 FMKNSFRPYLCFGDIAEANFLARYKQYSVVEQD---ISLARIFITSKIRNQHNLVKSLRD 133 Query: 127 -----------------PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKD 169 S+ +L GIEG+ + + +W GR K Sbjct: 134 KTPEQQEIVKKNKQLIAELENTTSLAELMGIEGNVAKNFFKGFYGHLD-SWQGR----KP 188 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDT 227 K D N + S L+ E + G+ GF H K S V D + + Sbjct: 189 RIKQDPYNVVLDLGYSMLFNFVECFLRLFGFDLYKGFCHQTWYKRKSLVCDFVEPFRC-I 247 Query: 228 VVPKAFEIARRNPGEPDRE 246 V + + + Sbjct: 248 VDNQVRKSWNLGQFSVEDF 266 >UniRef50_C6MJ62 CRISPR-associated protein Cas1 n=5 Tax=Nitrosomonas sp. AL212 RepID=C6MJ62_9PROT Length = 430 Score = 125 bits (313), Expect = 2e-27, Method: Composition-based stats. 
Identities = 40/264 (15%), Positives = 86/264 (32%), Gaps = 40/264 (15%) Query: 18 IFLQYGQIDVIDG-AFVLIDKTGIRTHIPVGSVACIMLEPGT-RVSHAAVRLAAQVGTLL 75 + ++ GQ+ + G + + + R V +V I++ T +S AA++ + L Sbjct: 122 LQVEKGQLVLRSGYSCSTVTEREKRITRGVHAVRAIIVINTTGNLSTAAIQWCSDQRIAL 181 Query: 76 VWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRF----------- 124 + + P Q + L + +++ +F Sbjct: 182 YVLDRDAHLTALTHAPTPHVVSLRRLQYSV-----ETLTLAKEILYRKFDACIKCMPILA 236 Query: 125 ------GEPAPARRSVEQLRGIEGSRVRATYALLA--------KQYGVTWNGRRYDP--- 167 + ++E++R IE ++ K + W G Sbjct: 237 QSFTQSADSILYANTLEEMRLIEARAAIIYWSTWNNCVIKWKEKDVPIEWRGFSQRASGI 296 Query: 168 --KDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADII 223 K ++ +N ++ A + L G E A+ G A+G +H + S VYD+ + + Sbjct: 297 SGKGYKATHPVNAILNYAYAILAGQVERALQIVGLDVAVGSLHADQDGRASLVYDLMEPL 356 Query: 224 KFDTVVPKAFEIARRNPGEPDREV 247 + + F V Sbjct: 357 RP-VIDKIIFAWVTSQQWRRADFV 379 >UniRef50_Q03LF6 CRISPR-associated protein, Cas1 family n=6 Tax=Streptococcus RepID=Q03LF6_STRTD Length = 303 Score = 124 bits (312), Expect = 3e-27, Method: Composition-based stats. 
Identities = 38/278 (13%), Positives = 90/278 (32%), Gaps = 44/278 (15%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTR 59 MTW ++ + ++ + + +L+ K G +P+ ++ I+ T Sbjct: 1 MTWRVVHVSQSEKM---------RLKLDN---LLVQKMGQEFTVPLSDISIIVAEGGDTV 48 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVR--VYASGQPGGARSDKLLYQAKLALDEDLRLKVVR 117 V+ + ++ LV + +Y S +L Q + + + + Sbjct: 49 VTLRLLSALSKYNIALVVCDNEHLPTGIYHSQNGHFRAYKRLKEQLDWSQKQKDKAWQIV 108 Query: 118 KMFELRFGEPAPA--RRSVEQLR---------------GIEGSRVRATYALLAKQYGVTW 160 +++ E A +S++ +R EG + + L Sbjct: 109 TYYKINNQEDVLAMFEKSLDNIRLLSDYKEQIEPGDRTNREGHAAKVYFNEL-------- 160 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYD 218 G+++ ++ D IN ++ + + + G +G H + V D Sbjct: 161 FGKQFVRVTQKEADVINAGLNYGYAIMRAQMARIVAGYGLNGLLGIFHKNEYNQFNLVDD 220 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFR 256 + + + V ++ R E RL D+ Sbjct: 221 LMEPFR-QIVDVWVYDNLRDQEF-LKYEYRLGLTDLLN 256 >UniRef50_C3WD45 CRISPR-associated protein cas1 n=1 Tax=Fusobacterium mortiferum ATCC 9817 RepID=C3WD45_FUSMR Length = 327 Score = 123 bits (309), Expect = 7e-27, Method: Composition-based stats. 
Identities = 40/295 (13%), Positives = 93/295 (31%), Gaps = 45/295 (15%) Query: 18 IFLQY--GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLL 75 +F+Q +I FV+ K + + + I+++ G +++ + + L Sbjct: 3 LFVQKNGSKIYREKEYFVVERKEEGKEYFSYNLIDNIIIQEGNQLTSDFLLEIIEKDIPL 62 Query: 76 VWVGEAGV-RVYASGQPGGARSDKLLYQAKLALDE--------------DLRLKVVRKMF 120 + G R + S+ Q +L + E + + K ++K++ Sbjct: 63 YLGDKYGNIRGKFTPITYNTNSNIRERQYQLWIREYGKELGKSWIMEKIENQKKHIQKIY 122 Query: 121 ELR------------FGEPAPARRSV--------EQLRGIEGSRVRATYALLAKQYGVTW 160 R F E +++ ++ G EG Y + K W Sbjct: 123 SRRGVLDKFLEIERKFDENITKIKNITWNDKDFENRVMGYEGRSSILYYEEIKKFLPENW 182 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYD 218 + + + N ++ A LY E + AG +G +H+ S ++D Sbjct: 183 SF--TKRETQGAKEPYNIVLNYAFGILYFKLERYLTLAGLDIQLGIIHSNNNKSNSLIFD 240 Query: 219 IADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 + + AF + + + + + + + + L Sbjct: 241 FIEPFR-ILAWECAFSLFSKKELNKNYF---NLNEGKIELEGKRVIAKDLYNRLK 291 >UniRef50_B8I084 CRISPR-associated protein Cas1 n=1 Tax=Clostridium cellulolyticum H10 RepID=B8I084_CLOCE Length = 298 Score = 123 bits (309), Expect = 7e-27, Method: Composition-based stats. 
Identities = 27/251 (10%), Positives = 73/251 (29%), Gaps = 35/251 (13%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTL 74 ++I ++ + + G IP+ + I+L+ ++ A + A+ Sbjct: 5 NIIVSNPTKLKLKQNNLWVEQSDG--FSIPIDDINTIVLDSADVTITSALLSKLAEEDIA 62 Query: 75 LVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP-----AP 129 L + + ++ L + + +++ + + Sbjct: 63 LYSCDGKHTPNGVLLPFSCHSRQYKIVKTQINLSAPFKKRCWQRVVQQKIENQAFCLNIL 122 Query: 130 ARRSVEQLRGI------------EGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTIN 177 + ++L + E + +++L + D N Sbjct: 123 ELKGRDELINLSKSVLSGDSTNVEAHAAKYYFSVLFTNF------------KRGMQDNTN 170 Query: 178 QCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEI 235 ++ S L G I + G+ P+IG H + +F D + + V + Sbjct: 171 YALNYGYSILRGAVARTIASYGFIPSIGIHHRSELNNFNLADDFIEPFRP-IVDMWVKQN 229 Query: 236 ARRNPGEPDRE 246 + + Sbjct: 230 INEDTLLTPKH 240 >UniRef50_C6HZN2 CRISPR-associated protein Cas1 n=1 Tax=Leptospirillum ferrodiazotrophum RepID=C6HZN2_9BACT Length = 245 Score = 123 bits (308), Expect = 1e-26, Method: Composition-based stats. Identities = 42/245 (17%), Positives = 82/245 (33%), Gaps = 46/245 (18%) Query: 15 VSMIFLQYG--QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVG 72 ++ ++L +DV +G+ +L ++ IP+ + +++ +S A + ++ G Sbjct: 1 MATLYLDRKGLDLDVENGSLILREEGERIRSIPLTFLDRVVIRANISLSSAVLGELSESG 60 Query: 73 T-LLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPAR 131 T +V G G +V + L Q L D LR + ++ + A + Sbjct: 61 TETVVLSGRQGRKVARIEGSRHNDARIRLRQYALFHDTTLRKRWAGRLVRSKIRSQARSL 120 Query: 132 RSV-----------------------------------EQLRGIEGSRVRATYALLAKQY 156 ++ +L G+EG + +K + Sbjct: 121 GTILKVRPDLTSSLLRPMEQLQSIGEKIRERTLDPFEISELLGLEGGAGSLYFESFSKAF 180 Query: 157 G--VTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KP 212 + + R P D N +S A + L+ G P IGF H Sbjct: 181 PPSLGFTSRNRRPPR----DPANAVLSLAYTLLHFDGVRTANMVGLDPLIGFYHEPAYGR 236 Query: 213 LSFVY 217 S V+ Sbjct: 237 DSLVF 241 >UniRef50_C0WE67 Crispr-associated protein n=1 Tax=Acidaminococcus sp. 
D21 RepID=C0WE67_9FIRM Length = 290 Score = 123 bits (308), Expect = 1e-26, Method: Composition-based stats. Identities = 35/281 (12%), Positives = 87/281 (30%), Gaps = 41/281 (14%) Query: 14 RVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLE-PGTRVSHAAVRLAAQVG 72 +++ Q ++D + + I + + +++E G ++ + Sbjct: 3 YRTIVITQRCKLDFCMNYVEVQTAVSKK-RIFIDEIKTLIIENTGVAITAYLLSELMNRK 61 Query: 73 TLLVWVGEAGVRVYASGQPGGA-RSDKLLYQAKLALDEDL-----RLKVVRKMFELRFGE 126 L++ +A + S + + Q + E + + + F + Sbjct: 62 VNLIFCDKAHNPQSSLLPLHARFDSIRKIKQQMIWSQEIKDEVWDCIVKAKIRQQALFLD 121 Query: 127 PAPARRSVEQLRGI------------EGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGD 174 + + L EG + + L +G ++ + Sbjct: 122 ELEKKEQSKMLMSYLDDVVSADAHNREGHAAKVYFNAL---FGNSFT--------RDLDS 170 Query: 175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKA 232 IN ++ S L + I++AGY +G H P +F D+ + + V Sbjct: 171 PINAGLNYGYSLLLSLFNREIVSAGYLTQLGIFHENTYNPYNFSCDLMEPFR-ILVDRYV 229 Query: 233 FEIARRNPGEPDREVRLACRDIFRS----SKTLAKLIPLIE 269 + NP ++ + ++F+ + L I Sbjct: 230 Y---NMNPTTFAKDEKREIINLFQEILYIDDSRQFLANAIN 267 >UniRef50_C7G696 CRISPR-associated protein Cas1 n=2 Tax=Roseburia RepID=C7G696_9FIRM Length = 301 Score = 122 bits (306), Expect = 2e-26, Method: Composition-based stats. 
Identities = 40/270 (14%), Positives = 84/270 (31%), Gaps = 37/270 (13%) Query: 21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TRVSHAAVRLAAQVGTLLVWVG 79 + ++ + + + I IP+ + CI++E VS ++ A +G + Sbjct: 10 SHVKLSIKNQQLNIETD--IARQIPLEDINCIIIENQTVTVSAYLLQKMADMGIAVYVCD 67 Query: 80 EAGVRVYASGQP--GGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSV--- 134 E + L YQ + RL + ++R A + Sbjct: 68 EKHLPNAVLLPMVRHSRHFKILKYQIEAGKPLQKRLWQQIVVQKIRNQALCLAYLELDGS 127 Query: 135 EQLRGI------------EGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISA 182 E+L + E + L YG+ ++ IN ++ Sbjct: 128 EELMKMCKEVQSGDRTHVEAKAAAFYFKSL---YGLGFS--------RGNDHIINAALNY 176 Query: 183 ATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEI-ARRN 239 + + G+ +I+ G P+IG H + +F D+ + + + A Sbjct: 177 GYAIVRGLIARSIVCYGLEPSIGVFHHSELNNFNLADDMIEPFRPLVDLYVAQNYDIAEI 236 Query: 240 PGEPDREVRLACRDIFRSS---KTLAKLIP 266 + E + I K ++I Sbjct: 237 DSDLTPERKRGIFGIINYDMDMKGEKRIIS 266 >UniRef50_A9GQD8 Putative uncharacterized protein n=1 Tax=Sorangium cellulosum 'So ce 56' RepID=A9GQD8_SORC5 Length = 310 Score = 121 bits (305), Expect = 2e-26, Method: Composition-based stats. 
Identities = 36/230 (15%), Positives = 60/230 (26%), Gaps = 40/230 (17%) Query: 85 VYASGQPGGARSDKLLYQAKLALDEDLRLKVVR-----KMFELR---------------- 123 + + + + Q + A DE L R K+ R Sbjct: 50 LGRATGLESRNVELRVAQHRAASDEAFCLSFARGVVVSKIKNARTMLRRNHAAPEVAVLS 109 Query: 124 ----FGEPAPARRSVEQLRGIEGSRVRATYALLAKQYG--------VTWNGRRYDPKDWE 171 A S+ L GIEG+ R + A GR P Sbjct: 110 ELDQLARKAAEAPSLPSLLGIEGTAARVYFGAFAGMLKGAGEARGEFDLEGRNRRPPR-- 167 Query: 172 KGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK--PLSFVYDIADIIKFDTVV 229 D +N +S A + L + G P +GF H + + D+ + + Sbjct: 168 --DPVNALLSLAYALLAKDLATTLGTVGLDPLLGFYHQPRFGRPALALDLIEEFRPIVAD 225 Query: 230 PKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGEIQP 279 PD + + + K + E + P Sbjct: 226 SVVVAAINNGVVAPD-DFQRFGGAVALRPAGRKKFLQAYERRMDQLVTHP 274 >UniRef50_B3W9S5 CRISPR-associated protein n=4 Tax=Lactobacillus RepID=B3W9S5_LACCB Length = 301 Score = 120 bits (300), Expect = 8e-26, Method: Composition-based stats. Identities = 32/262 (12%), Positives = 83/262 (31%), Gaps = 35/262 (13%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTL 74 S+I Q+ ++ + V+ G ++P+ + ++ +S A+ A+ Sbjct: 5 SLIVTQHCKVTTKNRTLVVQT-DGEVNNVPIEDINQVVFTTTRALLSADAITTLAEANAK 63 Query: 75 LVWVGEAGVRVYASGQPGG--ARSDKLLYQA----KLALDEDLRLKVVRKMFELRFGE-- 126 +++ G G V + ++D + Q L + ++ + + + + Sbjct: 64 VIFSGRDGQPVTETTNLYSDRRKADLVRLQVNWPKSLVENLWTKIVAAKVSNQAQVTKLC 123 Query: 127 --------PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQ 178 + E + R + L+ +G ++ N Sbjct: 124 GFDNQSLLDDLDTLEINDRSNREATAARKYFKLI---FGDDFS--------RSDICATNA 172 Query: 179 CISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIA 236 ++ S L T AI++AG+ IG H+ + D+ + + A Sbjct: 173 ALNYGYSILLSTTNRAIVSAGHITEIGMHHSSVANQYNLGSDLMEPFRPAIDYWVA---- 228 Query: 237 RRNPGEPDREVRLACRDIFRSS 258 + + ++ + Sbjct: 229 NQKFTDLTPNIKYGLIALLNLE 250 >UniRef50_B3E1C9 CRISPR-associated protein Cas1 n=1 Tax=Methylacidiphilum infernorum V4 RepID=B3E1C9_METI4 Length = 217 Score = 118 bits (295), Expect = 3e-25, Method: 
Composition-based stats. Identities = 30/171 (17%), Positives = 58/171 (33%), Gaps = 11/171 (6%) Query: 100 LYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIEGSRVRATYALLAKQYGVT 159 + Q D L ++ R L R + +RG EG R + A+Q Sbjct: 1 MRQRYRPDDHKLLVRFKRACVGL------VHARQLPAVRGWEGWASRHYWRWFAQQVN-Q 53 Query: 160 WNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFVY 217 G + D +N ++ + L A+ AG P +G +H G+ + V Sbjct: 54 LGGFEERRTHGQTQDPVNLALNYGYALLRHRLGVAVRLAGLDPYLGVLHEANGRHEALVS 113 Query: 218 DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 D+ +I + + V + +P V ++ + + + Sbjct: 114 DLVEIFRPE-VDRLVIRLIHLKMIQPKDFVFQDGL-LWLRPEARRRFVQSF 162 >UniRef50_Q03JI7 CRISPR-associated protein, Cas1 family n=40 Tax=Bacilli RepID=Q03JI7_STRTD Length = 289 Score = 116 bits (292), Expect = 7e-25, Method: Composition-based stats. Identities = 31/289 (10%), Positives = 86/289 (29%), Gaps = 40/289 (13%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLLVWVGE 80 + ++ + + + I + + ++LE ++ V+ L+++ + Sbjct: 12 HSKLSYKNNHLIFRNSYKTEM-IHLSEIDILLLETTDIVLTTMLVKRLVDENILVIFCDD 70 Query: 81 AGVRV-YASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA---------PA 130 + + + S + + + +V + + + Sbjct: 71 KRLPTAFLTPYYARHDSSLQIARQIAWKENVKC-EVWTAIIAQKILNQSYYLGECSFFEK 129 Query: 131 RRSVEQL---------RGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCIS 181 +S+ +L EG R + L +G + E + IN + Sbjct: 130 SQSIMELYHGLERFDPSNREGHSARIYFNTL---FGNDFT--------RESDNDINAALD 178 Query: 182 AATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRN 239 + L + ++ G IG H + DI + + + ++ N Sbjct: 179 YGYTLLLSMFAREVVVCGCMTQIGLKHANQFNQFNLASDIMEPFRP-IIDRIVYQNRHNN 237 Query: 240 PGEPDREVRLACRDIFR---SSKTLAKLIPLI-EDVLAAGEIQPPAPPE 284 + +E+ + + L+ ++ + V+ A PE Sbjct: 238 FVKIKKELFSIFSETYLYNGKEMYLSNIVSDYTKKVIKALNQLGEEIPE 286 >UniRef50_C7XMU1 CRISPR-associated protein cas1 n=2 Tax=Fusobacterium RepID=C7XMU1_9FUSO Length = 292 Score = 116 bits (291), Expect = 9e-25, Method: Composition-based stats. 
Identities = 35/255 (13%), Positives = 88/255 (34%), Gaps = 36/255 (14%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGT-RVSHAAVRLAAQVGTLLVWVGE 80 ++D+ + + G I +G V ++LE T ++ A + + +++ E Sbjct: 12 RSKLDLRYNSISIRRDNGTDF-IHIGEVNTLILETTTISITAALMCELIKQKVKVIFCDE 70 Query: 81 AGVRVY-ASGQPGGARSDKLLYQAKLALDE--------------DLRLKVVRKMFELRFG 125 + G + + D + ++K+++K+ + + Sbjct: 71 KSNPHFELLPFYGSHDCSAKIKEQIAWTDFLKESLWTIIVTEKIENQMKLLKKLNKEEYK 130 Query: 126 EPAPARRSVE--QLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAA 183 +E EG + ++ L +G ++ K +++N ++ Sbjct: 131 ILQEYASQIEHNDNTNREGHSAKIYFSAL---FGNNFS--------RNKENSLNAFLNYG 179 Query: 184 TSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPG 241 L I+A GY IG H + D+ + + V A+ + NP Sbjct: 180 YQLLLSTFNKEIVANGYLTQIGIFHKNMFNYYNLSSDLMEPFR-VIVDELAY---KENPQ 235 Query: 242 EPDREVRLACRDIFR 256 + +++ + ++I Sbjct: 236 KFEKDEKRKLQNILN 250 >UniRef50_C2EF73 CRISPR-associated Cas1 family protein n=1 Tax=Lactobacillus salivarius ATCC 11741 RepID=C2EF73_9LACO Length = 308 Score = 116 bits (290), Expect = 1e-24, Method: Composition-based stats. 
Identities = 36/295 (12%), Positives = 94/295 (31%), Gaps = 43/295 (14%) Query: 9 IPLKDRVSMIFLQYGQ-IDVIDGAFVLIDK-TGIRTHIPVGSVACIMLEPGTR-VSHAAV 65 I ++D +I+++ + + +F L + + P V ++ + + +S + Sbjct: 5 IVMRD---IIYIENKHFVGITKDSFKLKNVVDESERYFPFDEVDYLIFDSRSSFISERVI 61 Query: 66 RLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFG 125 + L++ + G +LA+ + + ++ +K+ + + Sbjct: 62 TTCVENNIGLIFCDKTHTPQLLMTSIYGQNERFKRQSKQLAMSKKTKGRIWQKIIKTKIN 121 Query: 126 EP-------APARRSVEQLRGI------------EGSRVRATYALLAKQYGVTWNGRRYD 166 + + +R I E R ++ L + + Sbjct: 122 NQADCLKYVVKNDKVSDIVRSIGKTVTEGDKNNHEAYAARVYFSNL--------FSKSFK 173 Query: 167 PKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIK 224 ++ D IN ++ + L ++ G P+ G H F DI ++ + Sbjct: 174 RGRYD--DIINSSLNYGYAILRSAIRKELVIYGLEPSWGIHHVSTENPFNLSDDIIEVYR 231 Query: 225 FDTVVPKAFEIARRNPG-EPDREVRLACRDIFRSSKTLA----KLIPLIEDVLAA 274 E+ N E D E++ + + L+ I+ + + Sbjct: 232 PFL-DALVVELLNNNESEELDIELKKEIIKVLFEKCIIDNKVYNLLDAIKITIKS 285 >UniRef50_C8PNV3 CRISPR-associated protein Cas1 n=1 Tax=Treponema vincentii ATCC 35580 RepID=C8PNV3_9SPIO Length = 296 Score = 115 bits (289), Expect = 2e-24, Method: Composition-based stats. 
Identities = 32/282 (11%), Positives = 89/282 (31%), Gaps = 38/282 (13%) Query: 17 MIFLQYGQ-IDVIDGAFVLIDKT-GIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGT 73 +F + + V V+ + T +P+ + +++E ++ + + Sbjct: 5 TLFFSHAVCLSVKHKQLVIFSEETQEETLVPIEDIGFVIVENERVSLTIPLINELTENNC 64 Query: 74 LLVWVGEAGVRVYASGQPGGARSDKLLYQ---AKLALDEDLRLKVVRKM--FELRFGEPA 128 L++ E + +++ Q A++ ++ K +++ ++++ Sbjct: 65 ALIFCNEKHMPF---SMTMPLDCNEIQSQLFSAQINAKLPVKKKCWKQVVEYKIKNQGLL 121 Query: 129 PARRSVEQLR--------------GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGD 174 + ++ R +E + + L +G W G+ Sbjct: 122 LKKYDLDFARLADFSKRVKSGDSTNMESQAAKFYWDNL---FGKNW-------CRNRFGE 171 Query: 175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLS--FVYDIADIIKFDTVVPKA 232 N ++ + L T A+ +G PA+G H K + D+ + + Sbjct: 172 FPNNYLNYGYAILRAATARALAGSGLLPALGIHHHNKYNAYCLADDLMEPYRPFIDDEVI 231 Query: 233 -FEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 + + E E + + + L + L+ Sbjct: 232 EYISTNPDEKELGLEFKKKILKVLTRDVKMNNLTRPMMAALS 273 >UniRef50_UPI00016B206F hypothetical protein cdiviTM7_00753 n=1 Tax=candidate division TM7 single-cell isolate TM7c RepID=UPI00016B206F Length = 296 Score = 115 bits (287), Expect = 3e-24, Method: Composition-based stats. 
Identities = 34/263 (12%), Positives = 79/263 (30%), Gaps = 37/263 (14%) Query: 22 YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLLVWVGE 80 ++ + D V+ +T +P+ + ++L+ G + + A GT + E Sbjct: 11 PARLSLRDNQLVIAQETE--ATLPIEDIDSLILDGYGITTTTNLLAALATKGTTTIICDE 68 Query: 81 AGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP----ARRSVEQ 136 + + + ++A+ + L+ ++ +++ + A Sbjct: 69 KHLPASVLLPYSQHSRQAKVSRQQIAMSQPLKKQLWQQIIISKITNQADVLRSTGLDDAA 128 Query: 137 LR------------GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAAT 184 LR E R + L W N ++ Sbjct: 129 LRTHISDVKSGDTSNRESIAARIYFDQLLDDA-------TRRKPIWH-----NTALNYGY 176 Query: 185 SCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGE 242 + + I A G + G H + SF D+ + + + ++A + G+ Sbjct: 177 AMVRSHIARHIAARGLVASQGIFHHNELNSFNLADDLIEPYRAAVDLYVLEKVAPLHVGD 236 Query: 243 PD----REVRLACRDIFRSSKTL 261 D + R DI + Sbjct: 237 RDASLTKHDRQLIIDILNYYVIM 259 >UniRef50_A8REI1 Putative uncharacterized protein n=2 Tax=unclassified Erysipelotrichaceae RepID=A8REI1_9FIRM Length = 300 Score = 114 bits (285), Expect = 4e-24, Method: Composition-based stats. 
Identities = 35/308 (11%), Positives = 94/308 (30%), Gaps = 56/308 (18%) Query: 17 MIFLQ---YGQIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTRVSHAAVRLAAQVG 72 +++++ + Q+ + + + G P+ + +++ + ++ V + Sbjct: 5 VLYIENQYHLQLYLDN--LKVETSQG-DIQFPISDIQILVIDHYRSTLTVPLVNKLTENN 61 Query: 73 TLLVWVGEAGVRV-YASGQPGG-ARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP- 129 ++ G + Y G A+S ++ Q +E ++ + +++ + + Sbjct: 62 VCVIICGIDHLPKSYILPMNGHFAQSGNIMKQIT-WSNEIKQI-LHQQIVKAKIFNQIEI 119 Query: 130 ------------------ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWE 171 + EG + + + + + Sbjct: 120 LKTNQCKYEVIKKLYEFYDTVDLGDATNREGLAAKMYFREMFGNDFIRFE---------- 169 Query: 172 KGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVV 229 D IN ++ S + + I+ GY P +G H GK F DI ++ + V Sbjct: 170 -DDVINAGLNYGYSIFRSLISSIIVGKGYLPNLGIFHKGKTNMFNLSDDIIEVFRP-IVD 227 Query: 230 PKAFEIARRNPGEPDREVRLACRDIFRSS-----------KTLAKLIPLIEDVLAAGEIQ 278 ++ + + R + + + + I V+ G+I Sbjct: 228 KYVYDNLKEELLFKSKH-REELIQLTVKKIEIDGKLQTVPNAVNEYVESILKVIETGKIT 286 Query: 279 PPAPPEDA 286 P Sbjct: 287 DFKFPSPH 294 >UniRef50_A8ABE8 CRISPR-associated protein Cas1 n=1 Tax=Ignicoccus hospitalis KIN4/I RepID=A8ABE8_IGNH4 Length = 310 Score = 114 bits (285), Expect = 4e-24, Method: Composition-based stats. 
Identities = 36/260 (13%), Positives = 80/260 (30%), Gaps = 43/260 (16%) Query: 22 YGQIDVIDGAFVLI---DKTGIRTHIPVGSVAC-IMLEPGTRVSHAAVRLAAQVGTLLVW 77 + + + D + P + + + G VS AA++L + ++ Sbjct: 8 HAFVGRKGYMITVRYKKDGKEVTEAFPALDIEMAVFVGKGITVSTAALQLLEEQNVPTLF 67 Query: 78 VGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGE----------- 126 G S Q + + +L +KV ++ + Sbjct: 68 HGVD-WSFVTINPVKVGWSRARKNQYSM-GETELGVKVAKEFIFGKLEGMSNVAKNLSYK 125 Query: 127 -----------------PAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKD 169 + ++++ ++ +E + + + R P+ Sbjct: 126 GKKPTPNSDYWRSEGRGELASCKNLDCVKKLEAEWSSKLWKDIVQFVP---GMRSRVPR- 181 Query: 170 WEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKFDT 227 D N+ + + LY V A++ AG P G +H + SFVYD +++ K Sbjct: 182 --GNDPPNRTLDYLYALLYSVCNHALVGAGLDPYAGLIHRERAGKLSFVYDFSEMFKP-M 238 Query: 228 VVPKAFEIARRNPGEPDREV 247 + R E + + Sbjct: 239 AIYVMATAIRTYKIELEGDF 258 >UniRef50_A7HP88 CRISPR-associated protein Cas1 n=1 Tax=Parvibaculum lavamentivorans DS-1 RepID=A7HP88_PARL1 Length = 311 Score = 114 bits (285), Expect = 5e-24, Method: Composition-based stats. 
Identities = 32/251 (12%), Positives = 81/251 (32%), Gaps = 16/251 (6%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLLVWVGEAG 82 ++ V + V+ + +P+ + ++++ + A + G ++ G Sbjct: 14 RLSVANKQLVIERPDLPKATLPIEDLGVVIVDDLRATYTQAVFIELLEAGATVMVTGRDH 73 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA--------PARRSV 134 + ++A++ E + + + + + + + Sbjct: 74 LPAGMMLPLDAHHIQTERHRAQVEASEPTKKRAWQALIRSKIAQQGIVLAHFTGEHGGLL 133 Query: 135 EQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAA 194 R + A A++Y G+ + +G +N ++ + + T A Sbjct: 134 PMARRVRSGDPDNLEAQAAQRYWPRLFGKDFRRDRDLEG--VNALLNYGYAVVRAATARA 191 Query: 195 ILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEP---DREVRL 249 +AAG P++G H + F D+ + + + P DR+ R Sbjct: 192 TVAAGLIPSLGVFHRNRANPFCLADDLLEPYRPYVDWRVRLLANQMGEEAPSLDDRDTRA 251 Query: 250 ACRDIFRSSKT 260 A IF + Sbjct: 252 ALLSIFNETVL 262 >UniRef50_Q5LZX6 Putative uncharacterized protein n=2 Tax=Streptococcus thermophilus RepID=Q5LZX6_STRT1 Length = 207 Score = 113 bits (283), Expect = 8e-24, Method: Composition-based stats. Identities = 33/210 (15%), Positives = 69/210 (32%), Gaps = 36/210 (17%) Query: 15 VSMIFLQYG--QIDVIDGAFVLI-DKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQV 71 +S ++ Q + + + ++ D I + + V ++L +++ ++ ++ Sbjct: 1 MSDLYSQRSNYYLSLSEQRIIIKNDNKEIVKEVSISLVDNVLLFGNAQLTTQLIKALSKN 60 Query: 72 GTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFE---------- 121 + + G + + K QAK +ED RL+V R + Sbjct: 61 KVNVYYFSNVGQFISSIETHRQDEFQKQELQAKAYFEEDFRLEVARSIATTKVRHPIALL 120 Query: 122 --------------LRFGE---PAPARRSVEQLRGIEGSRVRATYALLAKQYG--VTWNG 162 RF + S+ ++ G EG ++ + L +NG Sbjct: 121 REFDTDGLLDTSDYSRFEDSVNDIQKAYSITEIMGYEGRLAKSYFYYLNLLVPNDFHFNG 180 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTE 192 R P D N ++ S LY Sbjct: 181 RSRRP----GEDCFNSALNFGYSILYSCLM 206 >UniRef50_B5ZLL1 CRISPR-associated protein Cas1 n=2 Tax=Gluconacetobacter diazotrophicus PAl 5 RepID=B5ZLL1_GLUDA Length = 297 Score = 113 bits (283), Expect = 8e-24, Method: Composition-based stats. 
Identities = 38/284 (13%), Positives = 93/284 (32%), Gaps = 45/284 (15%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPG-TR 59 M W ++ ++ + V+ + G + V +AC++L+ Sbjct: 1 MAWRGVHIS-----------HPSRLTHRNRQLVVA-QDGGEVSLAVEDIACLILDTRQVS 48 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM 119 ++ + + A+ G ++ + + A++++ + L+ ++ + + Sbjct: 49 ITGSLLSALAENGVAMIVPDARHHPAGILLPFHQHHAQAHIAHAQISISQPLKKRLWQTL 108 Query: 120 FELRFGEP--------APARRSVEQLRG---------IEGSRVRATYALLAKQYGVTWNG 162 + P +++ + G +E RA +A L + Sbjct: 109 VVAKIRNQAALLDQLGRPQGQTIAAMAGRVASGDPGNVEAQAARAYWASLFSDF------ 162 Query: 163 RRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIA 220 + D N ++ + + A +A G PA G H K +F D+ Sbjct: 163 -----TRANENDRRNALLNYGYAIMRAAIARACVALGLLPAFGVHHASKTNAFNLVDDLI 217 Query: 221 DIIKFDTVVPKAFEIARRNPGE-PDREVRLACRDIFRSSKTLAK 263 + + V A + A + G+ E R I + + + Sbjct: 218 EPFRP-FVDRMAHDRALEHVGDTLSIEDRRQMSTILNDNAAIGR 260 >UniRef50_Q6KIQ8 Conserved expressed putative DNA-repair protein n=1 Tax=Mycoplasma mobile RepID=Q6KIQ8_MYCMO Length = 300 Score = 113 bits (282), Expect = 9e-24, Method: Composition-based stats. 
Identities = 31/284 (10%), Positives = 82/284 (28%), Gaps = 50/284 (17%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTR 59 M W + D ++L I + +L + + I+ + Sbjct: 1 MGWKIVEI--NTDEYVHLYLNNLYIKRKNEKILLN----------IRDIDTILFNNQYST 48 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRV-YASGQPGGARSDKLLYQAKLALDEDLRLKVVRK 118 +S + A+ ++++ E Y G S K++ + E + + + Sbjct: 49 ISIRLLTFLAKNNVNIIFMNEKNEPNSYLIPIEGNHNSLKVVAEQVKWTKEYKAI-LWKD 107 Query: 119 MFELRFG----------------------EPAPARRSVEQLRGIEGSRVRATYALLAKQY 156 + + + + ++ + EG + + + Y Sbjct: 108 IIKNKIHNQKNLLIKNKLFNNSEFGIEYFDDLINNVNLFDISNREGHAAKVYWNM---TY 164 Query: 157 GVTW-NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL-- 213 + ++K + +N ++ S L +I+ G P H Sbjct: 165 SKEFIRNNSATKLKFDKFEIVNAILNYGYSILRSSAIQSIIKKGLDPRFSIFHKSFSNFF 224 Query: 214 SFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRS 257 + D+ +I + V A+ + + + +D + Sbjct: 225 ALASDLIEIFRP-LVDEIAY------LHKDETFFDVKIKDELIN 261 >UniRef50_B1BJM4 Crispr-associated protein Cas1 n=2 Tax=Clostridium perfringens RepID=B1BJM4_CLOPE Length = 299 Score = 113 bits (282), Expect = 1e-23, Method: Composition-based stats. 
Identities = 25/245 (10%), Positives = 76/245 (31%), Gaps = 35/245 (14%) Query: 39 GIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSD 97 G +P+ ++ +++E ++ + ++ + E + V + Sbjct: 26 GEEFLVPIEDISVLLIEGVAVNLTARLLSKLSENNVATIICDEKHLPVSINIPINTHYKT 85 Query: 98 KLLYQAKLALDEDLRLKVVRKMFELRFGEPAPARRSVEQLRGIE--------------GS 143 + + + + ++ + + + + + ++ G E G+ Sbjct: 86 YKVLKQQFSQSAAFSKRIWQSIIKQKLINQG-KCLDILEISGYEFLKKLSDAVESGDKGN 144 Query: 144 R----VRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAG 199 R + + L G+ + D + IN ++ + L +++ G Sbjct: 145 REAIGAKYYFKNL--------FGKAFVRDD---ENGINSALNYGYTILRSAIARSLVMYG 193 Query: 200 YAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRS 257 + +G H + SF D + + + A + +E R+ D+ Sbjct: 194 FNTTLGVNHCNELNSFNLADDFIEPFRPIVDLWVAQNM--DYDDILTKEDRIGLVDLLNY 251 Query: 258 SKTLA 262 + Sbjct: 252 QCVID 256 >UniRef50_Q5HK87 CRISPR-associated protein Cas1 n=1 Tax=Staphylococcus epidermidis RP62A RepID=Q5HK87_STAEQ Length = 301 Score = 112 bits (280), Expect = 2e-23, Method: Composition-based stats. 
Identities = 35/287 (12%), Positives = 94/287 (32%), Gaps = 41/287 (14%) Query: 17 MIFLQ-YGQIDVIDGAFVLID--KTGIRTHIPVGSVACIMLEPGTR-VSHAAVRLAAQVG 72 +I+++ + + + + + + ++ + I+ + SH V + Sbjct: 4 VIYVENHYFVTAKENSIKFRNVIDKSEKFYL-FEEIEAIIFDHYKSYFSHKLVIKCIEND 62 Query: 73 TLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP----- 127 +++ + V G + Q++ L + ++ +K+ + Sbjct: 63 IAIIFCDKKHSPVTQLISSYGMVNRLKRIQSQFQLSGRTKDRIWKKIVINKIFNQTRCLE 122 Query: 128 -APARRSVEQLRGI------------EGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGD 174 +V+ + G E R + L G+++ + D Sbjct: 123 NNLHNENVKLMLGFAKEVSSGDKSNKEAHATRIYFKDL--------FGKQFKRGRYN--D 172 Query: 175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKA 232 IN ++ S L + G+ ++G H K F DI ++ + V Sbjct: 173 VINSGLNYGYSILRSFINKELAIHGFEMSLGIKHQSKENPFNLADDIIEVFRP-FVDNIV 231 Query: 233 FEIA-RRNPGEPDREVRLACRDIFRSSKTLA----KLIPLIEDVLAA 274 +EI ++N D + ++ + +L+ ++ V+ + Sbjct: 232 YEIVFKKNIDTFDINEKKLLLNVLYERCIIDKKVVRLLDSVKIVIQS 278 >UniRef50_D1AUW5 CRISPR-associated protein Cas1 n=1 Tax=Streptobacillus moniliformis DSM 12112 RepID=D1AUW5_STRM9 Length = 308 Score = 111 bits (278), Expect = 3e-23, Method: Composition-based stats. 
Identities = 38/295 (12%), Positives = 86/295 (29%), Gaps = 47/295 (15%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLL 75 ++ Q I + ++ + +P+ ++ I+ E + +S + + + Sbjct: 6 VLITQKSYIHFENDNLIVEKDD-KKLSVPISDISIIVFESLESYISLRIISELSLRNITM 64 Query: 76 VWVGEAGVRV-YASGQPGGAR---SDKLLYQAKLALDE--DLRLKVVRKMFELRFGEPAP 129 ++ + V Y+ R L Q E +L + + E Sbjct: 65 IFCDYRHMPVAYSLPINQHYRIPYVHSLQGQQTSKSKEFVWEKLLKAKIRNSKKVLELEN 124 Query: 130 ARRSVEQLR-------------GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 A + +L EG+ + + L YG + + D+I Sbjct: 125 ASSDLIELMQKYEDEVIGSDVQNREGTAAKVFFNYL---YGTNF-------CRQNERDSI 174 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFE 234 N + + G+A IG H+ + YD + + + F Sbjct: 175 NMALDYGYGVFRSAITRLLCTYGFATYIGVHHSSMMNAFNLTYDFIEPYRP-IIDYYVFN 233 Query: 235 IARRNPGEPD--REVRLACRDIFRSSK-----------TLAKLIPLIEDVLAAGE 276 R + + + R + ++ ++ LI + L G Sbjct: 234 HLYRFEKDDELKTDTRKELISLLNANIKVNEKEYTVLYSMELLIKSYLNFLEEGN 288 >UniRef50_C2CKI7 CRISPR associated protein n=1 Tax=Anaerococcus tetradius ATCC 35098 RepID=C2CKI7_9FIRM Length = 292 Score = 110 bits (275), Expect = 6e-23, Method: Composition-based stats. 
Identities = 32/257 (12%), Positives = 80/257 (31%), Gaps = 47/257 (18%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTR 59 MTW +++ + ++ + G ++ D + I + ++ I++ Sbjct: 1 MTWR-----------TVVISKRSKLFLKMGHMIVRDFENNISRIYLKDISHIIVTTTEAS 49 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVY--ASGQPGGARSDKLLYQAKLALDEDLRLKVVR 117 ++ A + + L+ E G +S S+K+ Q + D +V + Sbjct: 50 ITLALINEIQKQKIKLIICDEKGNPSAELSSYYNSYNSSEKIRAQIRWTSDIKG--EVWK 107 Query: 118 KMFELRFGEP--------APARRSVEQLRG---------IEGSRVRATYALLAKQYGVTW 160 ++ E + +E G E + + L Sbjct: 108 RIVEEKIKNQMHLLKSFRLEKSNLLEGYLGQVEIFDSTNREAHAAKVYFNSL-------- 159 Query: 161 NGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYD 218 G + D +N ++ S + +++ GY +G H + D Sbjct: 160 FGNDFSRNDDTN---LNANLNYGYSIILSAINRCVVSMGYLTQLGIFHENIYNQFNLSSD 216 Query: 219 IADIIKFDTVVPKAFEI 235 + + + V + ++ Sbjct: 217 LIEPWRP-MVDREVLDM 232 >UniRef50_Q5HSQ9 CRISPR-associated protein Cas1 n=12 Tax=Campylobacterales RepID=Q5HSQ9_CAMJR Length = 296 Score = 109 bits (272), Expect = 1e-22, Method: Composition-based stats. 
Identities = 30/264 (11%), Positives = 85/264 (32%), Gaps = 35/264 (13%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTL 74 +++ ++++ V+ + + + + I+LE +S A A+ + Sbjct: 9 TLLISSNAKLNLELNHLVI-KQDENIAKLFLKDINIIVLESLQVSISSALFNAFARHKII 67 Query: 75 LVWVGEAG--VRVYASGQPGGARSDKLLYQAKLALDEDLRLK-------------VVRKM 119 L+ E V+ + Q ++ + L +++K Sbjct: 68 LLTCDETHSINGVFTPFLGHFQSAKIAKEQMNVSAQKKAILWQKIIKNKILNQAFILKKY 127 Query: 120 FELRFGEPAPA---RRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 ++ + S+ + IE + L +G +++ ++ Sbjct: 128 NKIEQSNELINLAKKVSLNDSKNIEAVAAALYFKTL---FGTSFS--------RDELCFE 176 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFE 234 N ++ + + A+ +G P +G H SF D+ ++ + V + Sbjct: 177 NSALNYGYAIIRACIIRAVCISGLLPWLGIKHDNIYNSFALCDDLIEVFRAS-VDDCVLK 235 Query: 235 IARRNPGEPDREVRLACRDIFRSS 258 + + ++ + A +S Sbjct: 236 LKGESEF-LSKDDKRALIGNLQSK 258 >UniRef50_D2PIT7 CRISPR-associated protein Cas1 n=2 Tax=Sulfolobus RepID=D2PIT7_SULIS Length = 255 Score = 105 bits (262), Expect = 2e-21, Method: Composition-based stats. 
Identities = 43/241 (17%), Positives = 79/241 (32%), Gaps = 39/241 (16%) Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKM 119 +S + + + G+ V S D Q + ED ++ + ++ Sbjct: 1 MSSEVLLFLSSQNVPVAIHGKYSDVVLVSPFMNSMS-DVRSKQY--CMSEDRKIILAKRF 57 Query: 120 FELRFGEPAPARR---------------------SVEQLRGIEGSRVRATYALLAKQYGV 158 E + + + LR IE R + L K Sbjct: 58 IEGKVRGMFNVAKYFGYLNQIEVKVEELDLRGVNDLNSLRLIEAEYGRKAWDELKKFLPR 117 Query: 159 TWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHT--GKPLSFV 216 + GR D IN+ I A S +Y + A++A G P G +H+ LSF Sbjct: 118 EFTGR-----KPRNEDPINRAIDYAYSIIYSLCTHALIAVGLDPYAGVMHSNFPGRLSFT 172 Query: 217 YDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLAAGE 276 YD +++ K V +RR D+ + S ++ + + +L + Sbjct: 173 YDFSEMFK-SVAVHVVISSSRRVKLSLDK-------KGYLSKESAEYITKYLYTILRKKK 224 Query: 277 I 277 + Sbjct: 225 V 225 >UniRef50_C4ZA17 Putative uncharacterized protein n=2 Tax=Eubacterium RepID=C4ZA17_EUBR3 Length = 302 Score = 105 bits (262), Expect = 2e-21, Method: Composition-based stats. 
Identities = 39/284 (13%), Positives = 86/284 (30%), Gaps = 39/284 (13%) Query: 17 MIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTLL 75 ++ I V ++ K G IP+ ++ I+L+ + +S + A+ G L Sbjct: 6 ILIENEVTIKVKLNNLII-TKCGEDIWIPLDDISMIVLDNLASNLSTRLMCQLAEQGIGL 64 Query: 76 VWVGEAGVR--VYASGQPGGARSDKLLYQA--------KLALD--EDLRLKVVRKMFELR 123 + + + Y++ S + +Q KL + + + Sbjct: 65 MLYNQQHLPTGFYSAYDNHSRASKVIGFQIDKEQDYYGKLWQQIVKIKIENQAKAYQIMT 124 Query: 124 FGEPAPAR---RSVEQLRG----IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTI 176 P + S L G E + + LL G + + + + Sbjct: 125 RDSDGPEKIIEFSKNILIGDKSNREAHAAKVYFNLL--------MGTSFSRGNEDI--LL 174 Query: 177 NQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFE 234 N + + + G A + G IG H + V D+ + ++ + Sbjct: 175 NSGLDYGYAVIRGYIARACVGYGLNTQIGIHHKSEYNRFNLVDDLMEPLRPIVDIVAYNS 234 Query: 235 IARRNPGEPDREVRLACRDIFRSS----KTLAKLIPLIEDVLAA 274 + P+ R I + + +IE+ + Sbjct: 235 MKNDKYFTPEH--RRKLVSILNMKIMYREKKMYMCNMIENYIEQ 276 >UniRef50_UPI0001977683 putative CRISPR-associated protein n=1 Tax=Helicobacter cinaedi CCUG 18818 RepID=UPI0001977683 Length = 298 Score = 104 bits (259), Expect = 5e-21, Method: Composition-based stats. 
Identities = 32/270 (11%), Positives = 83/270 (30%), Gaps = 37/270 (13%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVA-CIMLEPGTRVSHAAVRLAAQVGTL 74 S++ ++ + VL + + + I+ P ++ A + A+ + Sbjct: 9 SVLVSSEAKLSLQANHLVLKQTN-KEAKLFLKDIHFVILESPQILITSALLSAFAKHNIV 67 Query: 75 LVWVGEAGVRVYA-SGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPA----- 128 L+ E+ G + K+ + + + V +K+ + + A Sbjct: 68 LLTCDESHHINGIMHSYLGHFQHAKIAKEQMMVSTHKKAI-VWQKIVKNKITNQAHILKL 126 Query: 129 --PARRSVEQL-----------RGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDT 175 ++ S E L R +E + L G+ + ++ + Sbjct: 127 HHKSKESDELLSFAKNVSLGDSRNLEAVAAAIYFKAL--------FGKEFHR---DEINF 175 Query: 176 INQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAF 233 IN ++ + L + +G +G H SF D+ ++ + V Sbjct: 176 INSALNYGYAILRACIVRNVCISGLITWLGIKHDNMYNSFNLCDDLIEVFRPW-VDLCVL 234 Query: 234 EIAR-RNPGEPDREVRLACRDIFRSSKTLA 262 ++ + +I + + + Sbjct: 235 KLNTLDKEATLRPNDKRELINILQQAALID 264 >UniRef50_A1RZT8 CRISPR-associated protein Cas1 n=1 Tax=Thermofilum pendens Hrk 5 RepID=A1RZT8_THEPD Length = 327 Score = 103 bits (258), Expect = 6e-21, Method: Composition-based stats. 
Identities = 44/275 (16%), Positives = 89/275 (32%), Gaps = 36/275 (13%) Query: 24 QIDVIDGAFVLIDKTGIRTHIPVGSVA-CIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAG 82 ++ V + V+I + V +++ G ++S + + A G L + + Sbjct: 11 EVTVSSRSTVVIKSGNRVFERALRDVDAVLVVGSGIKISSSLPPVLALHGIPLSILAKGH 70 Query: 83 VRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP--------------- 127 V V G ++ Q L + L + + + R Sbjct: 71 VAV-LLNPVGTKYNNYRALQYTLPKN--KALAIALEYLKSRVRGMASIIRNRGGRLPALP 127 Query: 128 --------APARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDP-----KDWEKGD 174 R +R E + + + K + + + D Sbjct: 128 EPPDPALYEDPARLESDIRSWEAAASNTLWDEVFKLLDPSAARELRERYGFAGRKPGHPD 187 Query: 175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGK-PLSFVYDIADIIKFDTVVPKAF 233 +N+ ISA + LY ++ A++AAG P GF+H + + +D A+ K V A Sbjct: 188 PLNKAISAMYAVLYTLSTKALVAAGLDPTYGFLHRTQYSVPLAFDYAEAFKP-LAVEAAL 246 Query: 234 EIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLI 268 ++ E +D K + +L + Sbjct: 247 DLVNEEGLPTLSEDGDLDKDSLN--KAMKRLYRYL 279 >UniRef50_Q73QW5 CRISPR-associated protein Cas1 n=7 Tax=Bacteria RepID=Q73QW5_TREDE Length = 290 Score = 102 bits (255), Expect = 1e-20, Method: Composition-based stats. 
Identities = 41/290 (14%), Positives = 92/290 (31%), Gaps = 32/290 (11%) Query: 16 SMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEP-GTRVSHAAVRLAAQVGTL 74 +++ ++D+ V+ + + ++ +++E ++ A + + Sbjct: 5 TVVISNRAKLDLHLNHLVVR--GEKTQKVFIEEISVLIIETTAVSITAALLNELIKQKVK 62 Query: 75 LVWVGEAGVRVY-ASGQPGGARSDKLLYQAKLALDEDLRL--------KVVRKMFELRFG 125 +++ E G G + + + +L K+ ++ + L Sbjct: 63 VIFCDEKRNPASELIGYYGSHDTSEKIRLQIKWDKNIKQLVWTEVVTEKIRQQKYLLEKL 122 Query: 126 EPAPARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATS 185 A E + I+ + A AK Y G + IN ++ S Sbjct: 123 NLPQASLLAEYITDIDINDKTNKEAHAAKAYFAALFGAGFSRSLDI---PINAALNYGYS 179 Query: 186 CLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEP 243 L I+A GY +G H P + D+ + + V + F++ NP + Sbjct: 180 ILLSAFNREIIANGYITQLGIFHDNMFNPFNLGSDLMEPFRP-LVDAEVFKL---NPQKF 235 Query: 244 DREVRLACRDIFRSS-----------KTLAKLIPLIEDVLAAGEIQPPAP 282 + E +L + K + + I D L +I Sbjct: 236 EHEEKLKIVSVINKKVLINNKEHYLNKAIEIFVHSIFDALNEKDISQINF 285 >UniRef50_A7BYT7 Putative uncharacterized protein n=1 Tax=Beggiatoa sp. PS RepID=A7BYT7_9GAMM Length = 165 Score = 101 bits (251), Expect = 4e-20, Method: Composition-based stats. Identities = 17/133 (12%), Positives = 39/133 (29%), Gaps = 5/133 (3%) Query: 138 RGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILA 197 GIEGS + +A + W + + D +N +S + + Sbjct: 1 MGIEGSIAQQYFAKWRILWDDMWGFKERNR--RPPRDPVNALLSLSYTLAGNSIGQLAST 58 Query: 198 AGYAPAIGFVHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIF 255 G ++GF+H S D+ + ++ ++ + P + + Sbjct: 59 RGLDISLGFLHAPLNGRPSLALDLLEPVRPWI-DQWIWQQVDKGLLTPKQFSNNLEQGCR 117 Query: 256 RSSKTLAKLIPLI 268 + Sbjct: 118 LDKEGRQAFFSAW 130 >UniRef50_C5NZ03 CRISPR-associated protein Cas1 n=2 Tax=Firmicutes RepID=C5NZ03_9BACL Length = 291 Score = 100 bits (249), Expect = 8e-20, Method: Composition-based stats. 
Identities = 28/256 (10%), Positives = 69/256 (26%), Gaps = 46/256 (17%) Query: 1 MTWLPLNPIPLKDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIML-EPGTR 59 MTW I ++DR ++D + + R + + + I++ Sbjct: 1 MTW---RTIIVRDR--------AKLDYSLNFLTVRKEAETRK-VSLSEIYMIIIESTAVS 48 Query: 60 VSHAAVRLAAQVGTLLVWVGEAGVRVY-ASGQPGGARSDKLLYQAKLALDEDLRLKVVRK 118 ++ + + ++ E G + E ++ + Sbjct: 49 ITAVLLNELMKNKIKVILCDEKRNPSSELIPYYGSHDTSLKYKNQIEWSAESKE-RIWTR 107 Query: 119 MFELRFGEPA--------PARRSVEQLR---------GIEGSRVRATYALLAKQYGVTWN 161 + + + +EQ EG + + + +G+ ++ Sbjct: 108 IVYEKINNQMLLLKKLGKVEYKLLEQYLTELEWNDSSNREGHAAKVYFNAM---FGMDFS 164 Query: 162 GRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTG--KPLSFVYDI 219 K N + S + I++ GY +G H + D Sbjct: 165 --------RNKECFTNAALDYGYSIILSAFNREIVSCGYFTQLGLCHRNPYNKFNLASDF 216 Query: 220 ADIIKFDTVVPKAFEI 235 + + V F + Sbjct: 217 MEPFR-ILVDETVFSL 231 >UniRef50_D0WRI5 CRISPR-associated protein Cas1, NMENI subtype n=3 Tax=Actinomycetales RepID=D0WRI5_9ACTO Length = 238 Score = 98.9 bits (245), Expect = 2e-19, Method: Composition-based stats. Identities = 41/238 (17%), Positives = 64/238 (26%), Gaps = 35/238 (14%) Query: 75 LVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDEDLRLKVVRKMFELRFGEPAP----- 129 +++ GV A+ ++A+ L + ++ + A Sbjct: 1 MLFCDWKGVPEGAAFSWSSHGRVGARHRAQACLSIPRQKNAWGRIIRAKVEGQAAVLREY 60 Query: 130 ARRSVEQLR------------GIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTIN 177 R +LR IE R + L R P N Sbjct: 61 NRVDTTELRSLAREVRSGDPGNIEARAARLYWQAL-----WGEESFRRHPGLGSGESCRN 115 Query: 178 QCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEI 235 + A + L G A+L AG +P IG H G+ +F DI + + Sbjct: 116 SHLDYAYTVLRGHGIRAVLGAGLSPTIGLFHHGRSNNFALVDDIIEPFRPAIDSSV--SR 173 Query: 236 ARRNPGEPDREVRLACRDIFRS---------SKTLAKLIPLIEDVLAAGEIQPPAPPE 284 N D EVR S L +L + P P Sbjct: 174 LAPNADMKDPEVRKHLVAAADQRFLPDGRRISAALDELAQHFGQYIEGELNTLPVPSW 231 >UniRef50_Q4A5I1 Putative uncharacterized protein n=5 Tax=Mycoplasma RepID=Q4A5I1_MYCS5 Length = 295 Score = 97.7 bits (242), Expect = 5e-19, Method: Composition-based stats. 
Identities = 36/255 (14%), Positives = 81/255 (31%), Gaps = 37/255 (14%) Query: 33 VLIDKTGIRTHIPVGSVACIMLE-PGTRVSHAAVRLAAQVGTLLVWVGEAGVRVY-ASGQ 90 +++ K + +P+ + I++ P +S + L+ + Sbjct: 21 LIVKKENNKIVLPLSDIDTILISNPYCTISVPLINAIVSNNINLIICNKDFEPNVQLLSI 80 Query: 91 PGGARSDKLLYQAKLALDEDLRLKVVRKMFELR-------------FGEPAPARRSV--- 134 G + L Q ++ + K K+ +L+ + + + Sbjct: 81 SGYYSNKNFLSQINW--TQEFKDKTWEKIIKLKTTNYVNLIYFFGLLNKEDVEKFNFYYK 138 Query: 135 EQLRG----IEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGV 190 + + G +EG + T+ L YG T+N +K + IN+ ++ + L Sbjct: 139 KIIPGDKNSMEGHIAKLTFKNL---YGSTFN-------RTDKENEINKFLNYGYTILMTY 188 Query: 191 TEAAILAAGYAPAIGFVHT--GKPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVR 248 ++ GY IG H + D+ + + + +E+ + Sbjct: 189 VSRNLVKKGYDNRIGVFHKSFNNHFALATDLMEPFR-FLIDKLVYELLIIEKNYDFINFK 247 Query: 249 LACRDIFRSSKTLAK 263 IF L K Sbjct: 248 KKVFLIFEEKILLNK 262 >UniRef50_C5F1H0 Crispr-protein cas1 n=1 Tax=Helicobacter pullorum MIT 98-5489 RepID=C5F1H0_9HELI Length = 285 Score = 96.9 bits (240), Expect = 8e-19, Method: Composition-based stats. 
Identities = 26/264 (9%), Positives = 81/264 (30%), Gaps = 38/264 (14%) Query: 46 VGSVACIML-EPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAK 104 + + C++L P ++ A + A+ L++ G ++ + + + Sbjct: 6 IKDLHCVVLESPQITITQALLSALAESKVLVLTCDRTHAINGVFTPFLGHFANAQVAREQ 65 Query: 105 LALDEDLRLKVVRKMFELRFGEP---------APARRSVEQL---------RGIEGSRVR 146 +A+ + + K+ +++ + + + ++ +E Sbjct: 66 IAVSTESKAKLWQQIVQNKIANQASVLQSCGYITEAVELARMCEKVEADDASNVEAKAAA 125 Query: 147 ATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGF 206 + L +G+ ++ + N + + ++ +G G Sbjct: 126 LYFKTL---FGIGFSRKAKHKIH-------NALLDYGYVIVRSCVIRSVCMSGLLTWSGI 175 Query: 207 VHTG--KPLSFVYDIADIIKFDTVVPKAFEIARRNPGEPDREVRLACRDIFRSSKTLAKL 264 H+ + DI ++ + V + + D V + S L + Sbjct: 176 KHSNQFNQFNLCDDIIEVFRP-FVDRCVLGLLESRGKDGDYAVVCDLSE-LDSKSYLETM 233 Query: 265 IPL-----IEDVLAAGEIQPPAPP 283 IE++ + ++ + P Sbjct: 234 TKEDKRALIENLQSEAKVGEQSFP 257 >UniRef50_C5EZ73 Crispr-protein cas1 n=1 Tax=Helicobacter pullorum MIT 98-5489 RepID=C5EZ73_9HELI Length = 230 Score = 91.9 bits (227), Expect = 3e-17, Method: Composition-based stats. Identities = 27/198 (13%), Positives = 61/198 (30%), Gaps = 33/198 (16%) Query: 50 ACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGARSDKLLYQAKLALDE 109 ++ G +S ++ A + ++ E ++ A + + QA + Sbjct: 39 RVVIETLGINLSSNFIKECAISKIQIDFI-ENNIQYAQLVAYNPAMTKIITMQAGIMGTP 97 Query: 110 DLRLKVVRKMFELRFGEP---------------------------APARRSVEQLRGIEG 142 ++ + R+ + ++++QL GIEG Sbjct: 98 -KQIFLAREFIYSKIKNQRNYLKYLSKYHNIINQTILDLDRYIKKLDMAKNIKQLMGIEG 156 Query: 143 SRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAP 202 + R + D +N + A + L+G +++I+ A P Sbjct: 157 KCAVLYWNTFRHMAKF----RGFHRIKRNAKDVLNASFNYAYAILHGSIQSSIIKAELNP 212 Query: 203 AIGFVHTGKPLSFVYDIA 220 I F+H S + Sbjct: 213 HISFLHIQNSKSLHLVLI 230 >UniRef50_B9CMG3 CRISPR-associated protein Cas1, NMENI subtype n=1 Tax=Atopobium rimae ATCC 49626 RepID=B9CMG3_9ACTN Length = 215 Score = 85.0 bits (209), Expect = 3e-15, Method: Composition-based stats. 
Identities = 27/191 (14%), Positives = 55/191 (28%), Gaps = 31/191 (16%) Query: 83 VRVYASGQPGGARSDKL--------LYQAKLALDEDLRLKV-----VRKMFELRFGEPAP 129 + +Y + ++L ++ D+ R+ + P Sbjct: 1 MPLYGAHNTPKRVVEQLGWSEPSKKRVWQRVVRDKITHQAQVLNARAREEIGQQLFGLIP 60 Query: 130 ARRSVEQLRGIEGSRVRATYALLAKQYGVTWNGRRYDPKDWEKGDTINQCISAATSCLYG 189 RS + E R + L +G ++ + IN + + L Sbjct: 61 EVRSGDT-TNREAHAARLYFHAL---FGHEFS--------RDDETPINAALDYGYAILLS 108 Query: 190 VTEAAILAAGYAPAIGFVHTGKPLSFVY--DIADIIKFDTVVPKAFEIARRNPGEPDREV 247 I+A GY G H + F D + + V F+ G+ ++ Sbjct: 109 AVNREIVARGYLTQSGICHRSEYNQFNLGCDFMEPFRP-IVDRLVFDNV---EGDFTKDT 164 Query: 248 RLACRDIFRSS 258 + D+ S Sbjct: 165 KRLLIDMLNQS 175

  Database: uniref50.fasta
    Posted date: Mar 8, 2010 10:38 AM
  Number of letters in database: 1,040,396,356
  Number of sequences in database: 3,077,464

Lambda     K       H
0.311      0.145   0.381

Lambda     K       H
0.267      0.0443  0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 1,721,612,640
Number of Sequences: 3077464
Number of extensions: 68933458
Number of successful extensions: 230781
Number of sequences better than 1.0e-01: 250
Number of HSP's better than 0.1 without gapping: 389
Number of HSP's successfully gapped in prelim test: 169
Number of HSP's that attempted gapping in prelim test: 229261
Number of HSP's gapped (non-prelim): 642
length of query: 305
length of database: 1,040,396,356
effective HSP length: 128
effective length of query: 177
effective length of database: 646,480,964
effective search space: 114427130628
effective search space used: 114427130628
T: 11
A: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.2 bits)
S2: 93 (40.3 bits)
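
The footer values above can be cross-checked with a little arithmetic. The sketch below is a minimal illustration, not part of the BLAST output: it assumes the standard Karlin-Altschul relation E = (effective search space) x 2^(-bit score), and the function names are hypothetical. Because this run uses composition-based statistics and the report rounds bit scores, recomputed E-values should only be expected to agree approximately with the reported ones.

```python
# Values copied from the statistics footer of this report.
QUERY_LEN = 305                # length of query
DB_LETTERS = 1_040_396_356     # length of database
DB_SEQS = 3_077_464            # number of sequences in database
EFF_HSP_LEN = 128              # effective HSP length


def effective_search_space(query_len, db_letters, db_seqs, hsp_len):
    """Effective query length times effective database length."""
    eff_query = query_len - hsp_len           # 305 - 128
    eff_db = db_letters - db_seqs * hsp_len   # one HSP length removed per sequence
    return eff_query * eff_db


def expect_from_bits(bit_score, search_space):
    """Approximate E-value from a bit score (assumed relation, see above)."""
    return search_space * 2.0 ** (-bit_score)


space = effective_search_space(QUERY_LEN, DB_LETTERS, DB_SEQS, EFF_HSP_LEN)
print(space)  # matches "effective search space" in the footer

# Last hit in the report: Score = 85.0 bits, Expect = 3e-15.
print(f"{expect_from_bits(85.0, space):.0e}")
```

Running the same check on other hits (for example 123 bits against the reported 1e-26) is a quick way to confirm that a score and E-value pair was parsed from the same alignment.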