![]() The web application visualizes the data through bar charts, table coloring, and categorization, and it is capable to link annotations to their source databases’ websites automatically.ģ.2 Visual Studio Code and Programming LanguagesĪccording to Guo (2014), Python and Java are the top two popular programming languages in US undergraduate education. Both applications have three query modes across all different-typed files in the original dataset: querying about a gene ID for all related genome, transcriptome, variations, and annotation data querying about all gene IDs on file whose annotations contain some given keywords querying about all gene variation information on a certain portion of a chromosome. The products of our case study include a SQLite database file (.db), a Flask-framework-based web application, and a supplementary, packaged, command-line-based Java application (.jar). R, Microsoft Excel, and different text editors were also used to process data. SQLite database was chosen because it requires fewer computer resources and it has easy connections to Java and Python programs. The source dataset was adopted from a journal published on Molecular Plant ( Shen et al., 2020, The Chromosome-Level Genome Sequence of the Autotetraploid Alfalfa and Resequencing of Core Germplasms Provide Genomic Resources for Alfalfa Research). In this article, a case study on Alfalfa ( Medicago sativa) was conducted with MacOS 11.2 and CentOS 7. ![]() Thus, integrating the genome, variation, transcriptome, and annotation data of a species into a relational database, and then utilizing a personalized website application to interact with it could be a better approach for research labs to manage their datasets. However, those platforms may not be customized or convenient enough to meet every research group’s needs. It is inconvenient and time-consuming to do cross-file searches and analyses on these data without help from some complex online platforms or tools such as BLAST ( U.S. For many species with complex chromosomal compositions, these data can be fragmented and enormous as a whole. There are many other text-based annotations with different file extensions like “.annotate”, “.pfam”, and “.interpro” as well. Serval standard data file formats, such as the FASTA Format (.fasta), the General Feature Format (.gff), and the Variant Call Format (.vcf), are used world-widely to store raw genetic sequences, genomic annotations, and population variation data ( Gourlé, n.d.). ![]() The field of bioinformatics has been brought to a new era as web server technologies, internet services, and database management tools develop rapidly in recent years.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |