-rw-r--r--  README                    35
-rw-r--r--  data/ProteinNames.txt     275
-rw-r--r--  doc/Data Deployments.dia  bin 0 -> 3566 bytes
3 files changed, 310 insertions, 0 deletions
diff --git a/README b/README
new file mode 100644
index 0000000..9caedb8
--- /dev/null
+++ b/README
@@ -0,0 +1,35 @@
+Experiment 007
+Don Pellegrino [don@drexel.edu]
+
+Collection and inventory of influenza data.
+
+INTRODUCTION
+
+The "Influenza Virus Resource" at NCBI
+[http://www.ncbi.nlm.nih.gov/genomes/FLU/] exposes the sequence records and
+their meta-data in a number of different ways. An exploration of the
+phylogenetic properties of the records first requires that the available data
+be collected and inventoried.
+
+Two primary alternatives have been identified for managing the data. The
+first is a relational database; IBM DB2 has been used for this. The use of
+a relational database is limited by the difficulty in sharing the data. Each
+vendor uses incompatible import and export routines. Additionally, installing
+an instance of a database management system (DBMS) often requires a large
+amount of effort and may not be practical on hosted environments which do not
+support the running of user daemons. Finally, proper parallelization of a DBMS
+will require additional system-specific configuration for each machine used.
+
+An alternative to the DBMS is to use a container file format such as HDF5.
+This has the advantage that all of the data can be collected into a single
+file which can then be shared with others. It has the disadvantage that it
+lacks the robust search and SQL operations provided by a DBMS. In addition,
+the two alternatives use fundamentally different storage strategies, with the
+DBMS using a relational model and the container file format using a
+hierarchical model.
+
+The "doc/Data Deployments.dia" diagram shows the source systems that
+expose the various records as well as the transform routines that are
+used for aggregation of the data on the local system.
+
+ LocalWords: NCBI parallelization HDF SQL Pellegrino phylogenetic DBMS dia
diff --git a/data/ProteinNames.txt b/data/ProteinNames.txt
new file mode 100644
index 0000000..13ae313
--- /dev/null
+++ b/data/ProteinNames.txt
@@ -0,0 +1,275 @@
+>A_PB2
+
+MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPER
+NEQGQTLWSKMSDAGSDRVMVSPLAVTWWNRNGPVTSTVHYPKVYKTYFDKVERLKHGTFGPVHFRNQVK
+IRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELRDCKISPLMVAYMLERE
+LVRKTRFLPVAGGTSSIYIEVLHLTQGTCWEQMYTPGGGVRNDDVDQSLIIAARNIVRRAAVSADPLASL
+LEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTSGSSVKKEEEVLTGNLQ
+TLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNF
+VNRANQRLNPMHQLLRHFQKDAKVLFQNWGVEHIDSVMGMIGVLPDMTPSTEMSMRGIRVSKMGVDEYSS
+TERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTERLTITYSSSMMWEINGPESVLVNTYQWIIRNWEAV
+KIQWSQNPAMLYNKMEFEPFQSLVPKAIRSQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSR
+MQFSSLTVNVRGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLIEDPDESTSGVESAVLRGFLIIGKE
+DRRYGPALSINELSNLAKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
+
+>A_PB1
+MDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGKWTTNTETGAPQLNPID
+GPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCLETMEAVQQTRVDKLTQGRQTYDWTLNRNQPAA
+TALANTIEVFRSNGLTANESGRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQ
+RVNKRGYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKA
+KLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLAMITYITKNQPEWFRNILSIAPIMFSNKMAR
+LGKGYMFESKRMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGTASLSPGMMMGMFNMLSTVLG
+VSILNLGQKKYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLVGINMSKKKSYINKTGTF
+EFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYR
+CHRGDTQIQTRRSFELKKLWDQTQSRAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDENYRGRLCNPLNP
+FVSHKEIESVNNAVVMPAHGPAKSMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFF
+PSSSYRRPIGISSMVEAMVSRARIDARIDFESGRIKKEEFSEIMKICSTIEELRRQK
+
+>A_PB1-F2
+MEQEQGTPWTQSTEHTNIQRRGSGRQIQKLGHPNSTQLMDHYLRIMNQVDMHKQTVSWRLWPSLKNPTQV
+SLRTHALKQWKPFNRQGWTN
+
+>A_PA
+
+MEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIVVELDDPNA
+LLKHRFEIIEGRDRTMAWTVVNSICNTTGAGKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKS
+ENTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMANRGLWDSFRQSERGEETIEEKFEITGT
+MRRLADQSLPPNFSCLENFRAYVDGFEPNGCIEGKLSQMSKEVNAQIEPFLKTTPRPIKLPNGPPCYQRS
+KFLLMDALKLSIEDPSHEGEGIPLYDAIKCIKTFFGWKEPYIVKPHEKGINSNYLLSWKQVLSELQDIEN
+EEKIPRTKNMKKTSQLKWALGENMAPEKVDFENCRDISDLKQYDSDEPELRSLSSWIQNEFNKACELTDS
+VWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCR
+TKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQISRP
+MFLYVRTNGTSKVKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSEAWPIGESPKGVEE
+GSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLGGLYEAIEECLINDPWV
+LLNASWFNSFLTHALK
+
+>A_HA
+MKTIIALSYILCLVFAQKLPGNDNSTATLCLGHHAVPNGTIVKTITNDQIEVTNATELVQSSSTGEICDS
+PHQILDGENCTLIDALLGDPQCDGFQNKKWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNES
+FNWTGVTQNGTSSACIRRSNNSFFSRLNWLTHLKFKYPALNVTMPNNEKFDKLYIWGVHHPGTDNDQIFL
+YAQASGRITVSTKRSQQTVIPNIGSRPRVRNIPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRSGK
+SSIMRSDAPIGKCNSECITPNGSIPNDKPFQNVNRITYGACPRYVKQNTLKLATGMRNVPEKQTRGIFGA
+IAGFIENGWEGMVDGWYGFRHQNSEGIGQAADLKSTQAAIDQINGKLNRLIGKTNEKFHQIEKEFSEVEG
+RIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTKKQLRENAEDMGNGCFKIYHKCD
+NACIGSIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVALLGFIMWACQKGNI
+RCNICI
+
+>A_NP
+MASQGTKRSYEQMETDGDRQNATEIRASVGKMIDGIGRFYIQMCTELKLSDHEGRLIQNSLTIEKMVLSA
+FDERRNKYLEEHPSAGKDPKKTGGPIYRRVDGKWMRELVLYDKEEIRRIWRQANNGEDATSGLTHIMIWH
+SNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGIGTMVMELIRMVKRGINDRNFWRGE
+NGRKTRSAYERMCNILKGKFQTAAQRAMVDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACA
+YGPAVSSGYDFEKEGYSLVGIDPFKLLQNSQIYSLIRPNENPAHKSQLVWMACHSAAFEDLRLLSFIRGT
+KVSPRGKLSTRGVQIASNENMDNMGSSTLELRSGYWAIRTRSGGNTNQQRASAGQTSVQPTFSVQRNLPF
+EKSTIMAAFTGNTEGRTSDMRAEIIRMMEGAKPEEVSFRGRGVFELSDEKATNPIVPSFDMSNEGSYFFG
+DNAEEYDN
+
+>A_NA
+MNPNQKIITIGSVSLTISTICFFMQTAILITTVTLHFKQYEFNSPPNNQVMLCEPTIIERNITEIVYLTN
+TTIEKEICPKLAEYRNWSKPQCDITGFAPFSKDNSIRLSAGGDIWVTREPYVSCDPDKCYQFALGQGTTL
+NNVHSNDTVRDRTPYRTLLMNELGVPFHLGTKQVCIAWSSSSCHDGKAWLHVCITGDDKNATASFIYNGR
+LVDSIVSWSKEILRTQESECVCINGTCTVVMTDGSASGKADTKILFIEEGKIVHTSTLSGSAQHVEECSC
+YPRYPGVRCVCRDNWKGSNRPIVDINIKDHSIVSSYVCSGLVGDTPRKNDSSSSSHCLDPNNEEGGHGVK
+GWAFDDGNDVWMGRTISEKSRLGYETFKVIEGWSNPKSKLQINRQVIVDRGNRSGYSGIFSVEGKSCINR
+CFYVELIRGRKEETEVLWTSNSIVVFCGTSGTYGTGSWPDGADINLMPI
+
+>A_NA
+MNTNQRIITIGTICLIVGIISLLLQIGNIILLWMSHSIQTGEKSHPKVCNQSVITYENNTWVNQTYVNIS
+NTNIAAGQGVTPIILAGNSSLCPISGWAIYSKDNSIRIGSKGDIFVMREPFISCSHLECRTFFLTQGALL
+NDRHSNGTVKDRSPYRTLMSCPIGEAPSPYNSRFESVAWSASACHDGMGWLTIGISGPDNGAVAVLKYNG
+IITDTIKSWRNKILRTQESECVCINGSCFTIMTDGPSNGQASYKLFKMEKGKIIRSIELDAPNYHYEECS
+CYPDTGKVVCVCRDNWHASNRPWVSFDQNLDYQIGYICSGVFGDNPRSNDGKGNCGPVLSNGANGVKGFS
+FRYGNGVWIGRTKSISSRSGFEMIWDPNGWTETDSSFSMKQDIIALTDWSGYSGSFVQHPELTGMNCIRP
+CFWVELIRGQPKESTIWTSGSSISFCGVNSGTASWSWPDGADLPFTIDK
+
+>A_M1
+MSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFSGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPS
+ERGLQRRRFVQNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEIALSYSAGALASCMGLIYNRMGAVTT
+EVAFGLVCATCEQIADSQHRSHRQMVATTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEIASQAR
+QMVQAMRAIGTHPSSSTGLRDDLLENLQTYQKRMGVQMQRFK
+
+>A_M2
+PIRNEWGCRCNDSSDPLVVAANIIGILHLILWILDRLFFKCVYRLFKHGLKRGPSTEGVPE
+SMREEYRKEQQNAVDADDSHFVSIELE
+
+>A_NS1
+MDSNTVSSFQVDCFLWHIRKQVVDQELSDAPFLDRLRRDQRSLRGRGNTLGLDIKAATHVGKQIVEKILK
+EESDEALKMTMVSTPASRYITDMTIEELSRNWFMLMPKQKVEGPLCIRMDQAIMEKNIMLKANFSVIFDR
+LETIVLLRAFTEEGAIVGEISPLPSFPGHTIEDVKNAIGVLIGGLEWNDNTVRVSKNLQRFAWRSSNENG
+GPPLTPKQKREMARTARSKV
+
+>A_NS2
+DILLRMSKMQLGSSSEDLNGMITQFESLKIYRDSLGEAVMRMGDLHLLQNRNGKWREQLG
+QKFEEIRWLIEEVRHRLKTTENSFEQITFMQALQLLFEVEQEIRTFSFQLI
+>B_PB1
+MNINPYFLFIDVPIQAAISTTFPYTGVPPYSHGTGTGYTIDTVIRTHEYSNKGKQYISDVTGCTMVDPTN
+GPLPEDNEPSAYAQLDCVLEALDRMDEEHPGLFQAASQNAMEALMVTTVDKLTQGRQTFDWTVCRNQPAA
+TALNTTITSFRLNDLNGADKGGLIPFCQDIIDSLDRPEMTFFSVKNIKKKLPAKNRKGFLIKRIPMKVKD
+KITKVEYIKRALSLNTMTKDAERGKLKRRAIATAGIQIRGFVLVVENLAKNICENLEQSGLPVGGNEKKA
+KLSNAVAKMLSNCPPGGISMTVTGDNTKWNECLNPRIFLAMTERITRDSPIWFRDFCSIAPVLFSNKIAR
+LGKGFMITSKTKRLKAQIPCPDLFSIPLERYNEETRAKLKKLKPFFNEEGTASLSPGMMMGMFNMLSTVL
+GVAALGIKNIGNKEYLWDGLQSSDDFALFVNAKDEETCMEGINDFYRTCKLLGVNMSKKKSYCNETGMFE
+FTSMFYRDGFVSNFAMELPSFGVAGVNESADMAIGMTIIKNNMINNGMGPATAQTAIQLFIADYRYTYKC
+HRGDSKVEGKRMKIIKELWENTKGRDGLLVADGGPNIYNLRNLHIPEIVLKYNLMDPEYKGRLLHPQNPF
+VGHLSIEGIKEADITPAHGPVKKMDYDAVSGTHSWRTKRNRSILNTDQRNMILEEQCYAKCCNLFEACFN
+SASYRKPVGQHSMLEAMAHRLRMDARLDYESGRMSKDDFEKAMAHLGEIGYI
+
+>B_PB2
+
+MTLAKIELLKQLLRDNEAKTVLKQTTVDQYNIIRKFNTSRIEKNPSLRMKWAMCSNFPLALTKGDMANRI
+PLEYKGIQLKTNAEDIGTKGQMCSIAAVTWWNTYGPIGDTEGFERVYESFFLRKMRLDNATWGRITFGPV
+ERVRKRVLLNPLTKEMPPDEASNVIMEILFPKEAGIPRESTWIHRELIKEKREKLKGTMITPIVLAYMLE
+RELVARRRFLPVAGATSAEFIEMLHCLQGENWRQIYHPGGNKLTESRSQSMIVACRKIIRRSIVASNPLE
+LAVEIANKTVIDTEPLKSCLAAIDGGDVACDIIRAALGLKIRQRQRFGRLELKRISGRGFKNDEEILIGN
+GTIQKIGIWDGEEEFHVRCGECRGILKKSKMKLEKLLINSAKKEDMRDLIILCMVFSQDTRMFQGVRGEI
+NFLNRAGQLLSPMYQLQRYFLNRSNDLFDQWGYEESPKASELHGINESMNASDYTLKGVVVTRNVIDDFS
+STETEKVSITKNLSLIKRTGEVIMGANDVSELESQAQLMITYDTPKMWEMGTTKELVQNTYQWVLKNLVT
+LKAQFLLGKEDMFQWDAFEAFESIIPQKMAGQYSGFARAVLKQMRDQEVMKTDQFIKLLPFCFSPPKLRS
+NGEPYQFLKLVLKGGGENFIEVRKGSPLFSYNPQTEVLTICGRMMSLKGKIEDEERNRSMGNAVLAGFLV
+SGKYDPDLGDFKTIEELEKLKPGEKANILLYQGKPVKVVKRKRYSALSNDISQGIKRQRMTVESMGWALS
+
+>B_PA
+
+MDTFITRNFQTTIIQKAKNTMAEFSEDPELQPAMLFNICVHLEVCYVISDMNFLDEEGKAYTALEGQGKE
+QNLRPQYEVIEGMPRTIAWMVQRSLAQEHGIETPKYLADLFDYKTKRFIEVGITKGLADDYFWKKKEKLG
+NSMELMIFSYNQDYSLSNESSLDEEGKGRVLSRLTELQAELSLKNLWQVLIGEEDVEKGIDFKLGQTISR
+LRDISVPAGFSNFEGMRSYIDNIDPKGAIERNLARMSPLVSVTPKKLTWEDLRPIGPHIYDHELPEVPYN
+AFLLMSDELGLANMTEGKSKKPKTLAKECLEKYSTLRDQTDPILIMKSEKANENFLWKLWRDCVNTISNE
+ETSNELQKTNYAKWATGDGLTYQKIMKEVAIDDETMCQEEPKIPNKCRVAAWVQTEMNLLSTLTSKRALD
+LPEIGPDVAPVEHVGSERRKYFVNEINYCKASTVMMKYVLFHTSLLNESNASMGKYKVIPITNRVVNEKG
+ESFDMLYGLAVKGQSHLRGDTDVVTVVTFEFSSTDPRVDSGKWPKYTVFRIGSLFVSGREKSVYLYCRVN
+GTNKIQMKWGMEARRCLLQSMQQMEAIVEQESSIQGYDMTKACFKGDRVNSPKTFSIGTQEGKLVKGSFG
+KALRVIFTKCLMHYVFGNAQLEGFSAESRRLLLLIQALKDRKGPWVFDLEGMYSGIEECISNNPWVIQSA
+YWFNEWLGFEKEGSKVLESVDEIMDE
+
+>B_HA
+MKAIIVLLMVVTSNADRICTGITSSNSPHVVKTATQGEVNVTGVIPLTTTPTKSHFANLKGTETRGKLCP
+KCLNCTDLDVALGRPKCTGNIPSARVSILHEVRPVTSGCFPIMHDRTKIRQLPNLLRGYEHIRLSTHNVI
+NAENAPGGPYKIGTSGSCPNVTNGNGFFATMAWAVPKNDNNKTATNSLTIEVPYICTEGEDQITVWGFHS
+DNETQMAKLYGDSKPQKFTSSANGVTTHYVSQIGGFPNQTEDGGLPQSGRIVVDYMVQKSGKTGTITYQR
+GILLPQKVWCASGRSKVIKGSLPLIGEADCLHEKYGGLNKSKPYYTGEHAKAIGNCPIWVKTPLKLANGT
+KYRPPAKLLKERGFFGAIAGFLEGGWEGMIAGWHGYTSHGAHGVAVAADLKSTQEAINKITKNLNSLSEL
+EVKNLQRLSGAMDELHNEILELDEKVDDLRADTISSQIELAVLLSNEGIINSEDEHLLALERKLKKMLGP
+SAVEIGNGCFETKHKCNQTCLDRIAAGTFDAGEFSLPTFDSLNITAASLNDDGLDNHTILLYYSTAASSL
+AVTLMIAIFVVYMVSRDNVSCSICL
+
+>B_NP
+MSNMDIDGINTGTIDKAPEEITSGTSGTTRPIIRPATLAPPSNKRTRNPSPERATTIGEADVGRKTQKKQ
+TPTEIKKSVYNMVVKLGEFYNQMMVKAGLNDDMERNLIQNAHAVERILLAATDDKKTEFQKKKNARDVKE
+GKEEIDHNKTGGTFYKMVRDDKTIYFSPIRVTFLKEEVKTMYKTTMGSDGFSGLNHIMIGHSQMNDVCFQ
+RSKALKRVGLDPSLISTFAGSTLPRRSGATGVAIKGGGTLVAEAIRFIGRAMADRGLLRDIKAKTAYEKI
+LLNLKNKCSAPQQKALVDQVIGSRNPGIADIEDLTLLARSMVVVRPSVASKVVLPISIYAKIPQLGFNVE
+EYSMVGYEAMALYNMATPVSILRVGDDAKDKSQLFFMSCFGAAYEDLRVLSALTGTEFKPRSALKCKGFH
+VPAKEQVEGMGAALMSIKLQFWAPMTRSGGNEVGGDGGSGQISCSPVFAVERPIALSKQAVRRMLSMNIE
+GRDADVKGNLLKMMNDSMAKKTNGNAFIGKKMFQISDKNKTNPVEIPIKQTIPNFFFGRDTAEDYDDLDY
+>B_NB
+MNNATFNYTNVNPISHIRGSIIITICVSFIIILTIFGYIAKILTNRNNCTNNAIGLCKCIKCSGCEPFCN
+KRGDTSSPRTGVDIPAFILPGLNLSESTPN
+
+>B_NA
+MLPSTIQTLTLFLTSGGVLLSLYVSASLSYLLYSDILLKFSPTEITAPTMPLDCANASNVQAVNRSATKG
+VTLLLPEPEWTYPRLSCPGSTFQKALLISPHRFGETKGNSAPLIIREPFIACGPNECKHFALTHYAAQPG
+GYYNGTRGDRNKLRHLISVKLGKIPTVENSIFHMAAWSGSACHDGKEWTYIGVDGPDNNALLKIKYGEAY
+TDTYHSYANKILRTQESACNCIGGNCYLMITDGSASGVSECRFLKIREGRIIKEIFPTGRVKHTEECTCG
+FASNKTIECACRDNSYTAKRPFVKLNVETDTAEIRLMCTDTYLDTPRPDDGSITGPCESNGDKGSGGIKG
+GFVHQRMASKIGRWYSRTMSKTERMGMGLYVKYDGDPWADSDALAFSGVMVSMKEPGWYSFGFEIKDKKC
+DVPCIGIEMVHDGGKETWHSAATAIYCLMGSGQLLWDTVTGVDMAL
+
+>B_M1
+
+MSLFGDTIAYLLSLTEDGEGKAELAEKLHCWFGGKEFDLDSALEWIKNKRCLTDIQKALIGASICFLKPK
+DQERKRRFITEPLSGMGTTATKKKGLILAERKMRRCVSFHEAFEIAEGHESSALLYCLMVMYLNPGNYSM
+QVKLGTLCALCEKQASHSHRAHSRAARSSVPGVRREMQMVSAMNTAKTMNGMGKGEDVQKLAEELQSNIG
+VLRSLGASQKNGEGIAKDVMEVLKQSSMGNSALVKKYL
+
+>B_BM2
+
+MLEPFQILSICSFILSALHFMAWTIGHLNQIKRGINMKIRIKGPNKETINREVSILRHSYQKEIQAKETM
+KEVLSDNMEVLSDHIIIEGLSAEEIIKMGETVLEIEELH
+
+>B_NS1
+MANNMTTTQIEVGPGATNATINFEAGILECYERLSWQRALDYPGQDRLNRLKRKLESRIKTHNKSEPESK
+RMSLEERKAIGVKMMKVLLFMNPSAGIEGFEPYCMKSSSNSNCTKYNWTDYPSTPGRCLDDIEEEPEDVD
+GPTEIVLRDMNNKDARQKIKEEVNTQKEGKFRLTIKRDMRNVLSLRVLVNGTFLKHPNGYKSLSTLHRLN
+AYDQSGRLVAKLVATDDLTVEDEEDGHRILNSLFERLNEGHSKPIRAAETAVGVLSQFGQEHRLSPEEGD
+N
+
+>B_NS2
+WRMKKMAIGSSTHSSSVLMKDIQSQFEQLKLRWESYPNLVKSTDYHQKRETIRLVTEEL
+YLLSKRIDDNILFHKTVIANSSIIADMVVSLSLLETLYEMKDVVEVYSRQCL
+
+>C_CM2
+MGRMAMKWLVVIICFSITSQPASACNLKTCLKLFNNTDAVTVHCFNENQGYMLTLASLGLGIITMLYLLV
+KIIIELVNGFVLGRWERWCGDIKTTIMPEIDSMEKDIALSRERLDLGEDAPDETDNSPIPFSNDGIFEI
+>C_M1
+MAHEILIAETEAFLKNVAPETRTAIISAITGGKSACKSAAKLIKNEHLPLMSGEATTMHIVMRCLYPEIK
+PWKKASDMLNKATSSLKKSEGRDIRKQMKAAGDFLGVESMMKMRAFRDDQIMEMVEEVYDHPDDYTPDIR
+IGTITAWLRCKNKKSERYRSNVSESGRTALKIHEVRKASTAMNEIAGITGLGEEALSLQRQTESLAILCN
+HTFGSNIMRPHLEKAIKGVEGRVGEMGRMAMK
+>C_NP
+MSDRRQNRKTPDEQRKANALIINENIEAYIAICKEVGLNGDEMLILENGIAIEKAIRICCDGKYQEKREK
+KAREAQRADSNFNADSIGIRLVKRAGSGTNITYHAVVELTSRSRIVQILKSHWGNELNRAKIAGKRLGFS
+ALFASNLEAIIYQRGRNAARRNGSAELFTLTQGAGIETRYKWIMEKHIGIGVLIADAKGLINGKREGKRG
+VDANVKLRAGTTGSPLERAMQGIEKKAFPGPLRALARRVVKANYNDAREALNVIAEASLLLKPQITNKMT
+MPWCMWLAARLTLKDEFANFCAYAGRRAFEVFNIAMEKIGICSFQGTIMNDDEIESIEDKAQVLMMACFG
+LAYEDFSLVSAMVSHPLKLRNRMKIGNFRVGEKVSTVLSPLLRFTRWAEFAQRFALQANTSREGAQISNS
+AVFAVERKITTDVQRVEELLNKVQAHEDEPLQTLYKKVREQISIIGRNKSEIKEFLGSSMYDLNDQEKQN
+PINFRSGAHPFFFEFDPDYNPIRVKRPKKPIAKRNSNISRLEEEGMDENSEIGQAKKMKPLDQLTSTSSN
+IPGKN
+>C_HE
+MFFSLLLMLGLTEAEKIKICLQKQVNSSFSLHNGFGGNLYATEEKRMFELVKPKAGASVLNQSTWIGFGD
+SRTDKSNSAFPRSADVSAKTADKFRSLSGGSLMLSMFGPPGKVDYLYQGCGKHKVFYEGVNWSPHAAINC
+YRKNWTDIKLNFQKNIYELASQSHCMSLVNALDKTIPLQATAGVAKNCNNSFLKNPALYTQEVNPSVEKC
+GKENLAFFTLPTQFGTYECKLHLVASCYFIYDSKEVYNKRGCDNYFQVIYDSSGKVVGGLDNRVSPYTGN
+SGDTPTMQCDMLQLKPGRYSVRSSPRFLLMPERSYCFDMKEKGPVTAVQSIWGKGRESDHAVDQACLSTP
+GCMLIQKQKPYIGEADDHHGDQEMRELLSGLDYEARCISQSGWVNETSPFTEEYLLPPKFGRCPLAAKEE
+SIPKIPDGLLIPTSGTDTTVTKPKSRIFGIDDLIIGLLFVAIVEAGIGGYLLGSRKVSGGGVTKESAEKG
+FEKIGNDIQILRSSTNIAIEKLNDRISHDEQAIRDLTLEIENARSEALLGELGIIRALLVGNISIGLQES
+LWELASEITNRAGDLAVEVSPGCWVIDNNICDQSCQNFIFKFNETAPVPTIPPLDTKIDLQSDPFYWGSS
+LGLAITAAISLAALVISGIAICRTK
+>C_P3
+MSKTFAEIAEAFLEPEAVRIAKEAVEEYGDHERKIIQIGIHFQVCCMFCDEYLSTNGSDRFVLIEGRKRG
+TAVSLQNELCKSYDLEPLPFLCDIFDREEKQFVEIGITRKADDSYFQSKFGKLGNSCKIFVFSYDGRLDK
+NCEGPMEEQKLRIFSFLATAADFLRKENMFNEIFLPDNEETIIEMKKGKTFLKLRDESVPLPFQTYEQMK
+DYCEKFKGNPRELASKVSQMQSNIKLPIKHYEQNKFRQIRLPKGPMAPYTHKFLMEEAWMFTKISDPERS
+RAGEILIDFFKKGNLSAIRPKDKPLQGKYPIHYKNLWNQIKAAIADRTMVINENDHSEFLGGIGRASKKI
+PEVSLTQDVITTEGLKQSENKLPEPRSFPKWFNAEWMWAIKDSDLTGWVPMAEYPPADNELEDYAEHLNK
+TMEGVLQGTNCAREMGKCILTVGALMTECRLFPGKIKVVPIYARSKERKSMQEGLPVPSEMDCLFGICVK
+SKSHLNKDDGMYTIITFEFSIREPNLEKHQKYTVFEAGHTTVRMKKGESVIGREVPLYLYCRTTALSKIK
+NDWLSKARRCFITTMDTVETICLRESAKAEENLVEKTLNEKQMWIGKKNGELIAQPLREALRVQLVQQFY
+FCIYNDSQLEGFCNEQKKILMALEGDKKNKSSFGFNPEGLLEKIEECLINNPMCLFMAQRLNELVIEASK
+RGAKFFKID
+>C_PB1
+MEINPYLMFLNNDVTSLISTTYPYTGPPPMSHGSSTKYTLETIKRTYDYSRTSVEKTSKVFNIPRRKFCN
+CLEDKDELVKPTGNVDISSLLGLAEMMEKRMGEGFFKHCVMEAETEILKMHFSRLTEGRQTYDWTSERNM
+PAATALQLTVDAIKETEGPFKGTTMLEYCNKMIEMLDWKEVKFRKVKTMVRREKDKRSGKEIKTKVPVMG
+IDSIKHDEFLIRALTINTMAKDGERGKLQRRAIATPGMIVRPFSKIVETVAQKICEKLKESGLPVGGNEK
+KAKLKTTVTSLNARMNSDQFAVNITGDNSKWNECQQPEAYLALLAYITKDSSDLMKDLCSVAPVLFCNKF
+VKLGQGIRLSNKRKTKEVIIKAEKMGKYKNLMREEYKNLFEPLEKYIQKDVCFLPGGMLMGMFNMLSTVL
+GVSTLCYMDEELKAKGCFWTGLQSSDDFVLFAVASNWSNIHWTIRRFNAVCKLIGINMSLEKSYGSLPEL
+FEFTSMFFDGEFVSNLAMELPAFTTAGVNEGVDFTAAMSIIKTNMINNSLSPSTALMALRICLQEFRATY
+RVHPWDSRVKGGRMKIINEFIKTIENKDGLLIADGGKLMNNISTLHIPEEVLKFEKMDEQYRNRVFNPKN
+PFTNFDKTIDIFRAHGPIRVEENEAVVSTHSFRTRANRTLLNTDMRAMMAEEKRYQMVCDMFKSVFESAD
+INPPIGAMSIGEAIEEKLLERAKMKRDIGAIEDSEYEEIKDIIRDAKKARIESR
+>C_PB2
+MSFLLTIAKEYKRLCQDAKAAQMMTVGTVSNYTTFKKWTTSRKEKNPSLRMRWAMSSKFPIIANKRMLEE
+AQIPKEHNNVALWEDTEDVSKRDHVLASASCINYWNFCGPCVNNSEVIKEVYKSRFGRLERRKEIMWKEL
+RFTLVDRQRRRVDTQPVEQRLRTGEIKDLQMWTLFEDEAPLASKFILDNYGLVKEMRSKFANKPLNKEVV
+AHMLEKQFNPESRFLPVFGAIRPERMELIHALGGETWIQEANTAGISNVDQRKNDMRAVCRKVCLAANAS
+IMNAKSKLVEYIKSTSMRIGETERKLEELILETDDVSPEVTLCKSALGGPLGKTLSFGPMLLKKISGSGV
+KVKDTVYIQGVRAVQFEYWSEQEEFYGEYKSATALFSRKERSLEWITIGGGINEDRKRLLAMCMIFCRDG
+DYFKDAPATITMADLSTKLGREIPYQYVMMNWIQKSEDNLEALLYSRGIVETNPGKMGSSMGIDGSKRAI
+KSLRAVTIQSGKIDMPESKEKIHLELSDNLEAFDSSGRIVATILDLPSDKKVTFQDVSFQHPDLAVLRDE
+KTAITKGYEALIKRLGTGDNDIPSLIAKKDYLSLYNLPEVKLMAPLIRPNRKGVYSRVARKLVSTQVTTG
+HYSLHELIKVLPFTYFAPKQGMFEGRLFFSNDSFVEPGVNNNVFSWSKADSSKIYCHGIAIRVPLVVGDE
+HMDTSLALLEGFSVCENDPRAPMVTRQDLIDVGFGQKVRLFVGQGSVRTFKRTASQRAASSDVNKNVKKI
+KMSN
+>C_NS1
+MSDKTVKSTNLMAFVATKMLERQEDLDTCTEMQVEKMKTSTKARLRTESSFAPRTWEDAIKDGELLFNGT
+ILQAESPTMTPASVEMKGKKFPIDFAPSNIAPIGQNPIYLSPCIPNFDGNVWEATMYHHRGATLTKTMNC
+NCFQRTIWCHPNPSRMRLSYAFVLYCRNTKKICGYLIAKQVAGIETGIRKCFRCIKSGFVMATDEISLTI
+LQSIKSGAQLDPYWGNETPDIDKTEAYMLSLREAGP
+>C_NS2
+EILRRSVD
+TSSLNKWPELKQELENVSDALKADSLWLPMKSLSLYSKVSNQEPSSIPIGEMKHQILTRLKLICSRLEKL
+DLNLSKAVLGIQNSEDLILIIYNRDVCKNTILMIKSLCNSLI
diff --git a/doc/Data Deployments.dia b/doc/Data Deployments.dia
new file mode 100644
index 0000000..b8ad4af
--- /dev/null
+++ b/doc/Data Deployments.dia
Binary files differ
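
The data/ProteinNames.txt file added above uses a FASTA-style layout: a
">NAME" header line followed by one or more sequence lines, with blank lines
placed inconsistently and the ">A_NA" header appearing twice. A minimal
sketch of an inventory reader for that layout follows. It is not part of the
commit; the function name parse_fasta and the sample records are hypothetical
and chosen only for illustration. Records are returned as a list of pairs
rather than a dictionary so that repeated names are not silently overwritten.

```python
from typing import Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (name, sequence) pairs in file order, keeping duplicate names."""
    records: List[Tuple[str, str]] = []
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank lines may appear after a header or between records.
            continue
        if line.startswith(">"):
            # A new header closes the record that was being collected.
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            # Sequence lines are concatenated without separators.
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Hypothetical sample mimicking the layout quirks of the real file:
# a blank line after a header, a missing blank line, and a repeated name.
sample = [
    ">X_ONE",
    "",
    "MKT",
    "AAL",
    "",
    ">X_TWO",
    "MNP",
    ">X_TWO",
    "MNT",
]

records = parse_fasta(sample)
for record_name, sequence in records:
    print(record_name, len(sequence))
```

To inventory the real file, the same function can be given an open file
handle, for example parse_fasta(open("data/ProteinNames.txt")), assuming the
script is run from the repository root.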


Copyright © 2009 Don Pellegrino All Rights Reserved.