How Do Biogenetics Databases Like NCBI's Blast Work?

  • Thread starter saysua
In summary, the conversation discusses BLAST, a program used to query for matching protein or gene sequences. The search results list species with similar sequences, along with their alignments. The letters in a protein sequence represent amino acids. In the line between query and subject, a letter marks an identical residue, a "+" marks a similar (conservative) substitution, and a blank marks a mismatch. The query is the protein being searched for and the subject is the matching protein found in the database. BLAST is not a database itself, but a program that performs sequence alignment. Further insights and resources for learning about it are requested.
  • #1
saysua
I think those who are well-immersed in the scientific community are familiar with the biological databases connected through the website www.ncbi.nlm.nih.gov.

I was just wondering if someone could explain a little about the Blast database to me. In a particular section of a homework assignment, Blast is needed to query for matching protein/gene sequences. The search for an unc C. elegans protein/gene sequence returned a bunch of data on species that have similar protein/gene sequences, along with alignments.

An example of the protein sequence results is:

Query  1  MEHEKDPGWQYLRRTREQVLEDQSKPYDSKKNVWIPDPEEGYLAGEITATKGDQVTIVTA  60
          MEHEKDPGWQYLRR+REQ+LEDQSKPYDSKKN WIPDPEEGYLAGEITATKGDQVTIVTA
Sbjct  1  MEHEKDPGWQYLRRSREQILEDQSKPYDSKKNCWIPDPEEGYLAGEITATKGDQVTIVTA  60


My questions are:
Is each of the letters a codon for an amino acid?
In terms of their relationship to each other and to my initial protein query, what are Query 1 and Sbjct 1?

I'd deeply appreciate it if you could provide further insights into this database. (Maybe there's a crash course or something on it somewhere?) My professor seems to be skimming the surface of the importance of these databases (Blast/WormBase/FlyBase), but I want to know more about them since there is a good chance I will find myself using them in the future.

Many thanks in advance!
 
  • #2
Each letter represents one amino acid (it is a one-letter amino acid code, not a codon).
http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/A/AminoAcids.html
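To make the codon versus amino acid distinction concrete, here is a minimal sketch in Python. It assumes a partial codon table containing only the five codons it needs, and the DNA string is a hypothetical one chosen to spell the first five residues of the query (MEHEK). It is not taken from the real gene.

```python
# Partial codon table: only the codons used in this example.
# The full standard genetic code has 64 codons.
CODON_TABLE = {
    "ATG": "M",  # methionine
    "GAA": "E",  # glutamate
    "GAG": "E",  # glutamate (a second codon for the same amino acid)
    "CAT": "H",  # histidine
    "AAA": "K",  # lysine
}


def translate(dna: str) -> str:
    """Translate DNA, three nucleotides at a time, into one-letter amino acid codes."""
    if len(dna) % 3 != 0:
        raise ValueError("DNA length must be a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical coding sequence (an assumption for illustration only).
example_dna = "ATGGAACATGAGAAA"
protein = translate(example_dna)
print(protein)                         # one letter per amino acid
print(len(example_dna), len(protein))  # three nucleotides per letter
```

So 15 nucleotides (five codons) give five letters. The letters in a BLAST protein alignment are the output of this step, not the codons themselves.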

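In the middle line, a letter means the query and subject have the identical amino acid at that position. The "+" sign marks a position where the two amino acids are different but similar (for example, in the same functional family). A blank space in the middle line marks a mismatch, and a dash in either sequence marks a gap.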
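Here is a rough sketch of how that middle line can be derived from the two sequences. It assumes the default BLOSUM62 scoring matrix and includes only the three substitution scores this particular alignment needs. The function name and the excerpt dictionary are made up for illustration and are not part of BLAST.

```python
# Excerpt of BLOSUM62 scores, limited to the three non-identical pairs
# in this alignment. The real matrix covers every amino acid pair.
BLOSUM62_EXCERPT = {
    frozenset("TS"): 1,   # threonine / serine: similar
    frozenset("VI"): 3,   # valine / isoleucine: similar
    frozenset("VC"): -1,  # valine / cysteine: not similar
}

QUERY = "MEHEKDPGWQYLRRTREQVLEDQSKPYDSKKNVWIPDPEEGYLAGEITATKGDQVTIVTA"
SBJCT = "MEHEKDPGWQYLRRSREQILEDQSKPYDSKKNCWIPDPEEGYLAGEITATKGDQVTIVTA"


def midline(query: str, subject: str) -> str:
    """Build the line shown between Query and Sbjct for an aligned pair."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    out = []
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            out.append(" ")   # gap column
        elif q == s:
            out.append(q)     # identical residue
        elif BLOSUM62_EXCERPT[frozenset((q, s))] > 0:
            out.append("+")   # different but similar (positive score)
        else:
            out.append(" ")   # mismatch
    return "".join(out)


print(QUERY)
print(midline(QUERY, SBJCT))
print(SBJCT)
```

Any pair outside the excerpt would raise a `KeyError`. A real tool would look the pair up in the full matrix instead.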

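Query is the protein that you "BLASTed"; subject (Sbjct) is the matching protein from the data bank. The numbers are amino acid positions: the 1 and the 60 mean the line shows residues 1 through 60 of that sequence. They are not labels for the query or the subject.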
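The numbers and the summary counts can be checked with a few lines of Python. This is only a sketch: the helper names are hypothetical, and the parsing assumes the simple four-field layout shown in the question rather than every BLAST output format.

```python
QUERY_LINE = "Query 1 MEHEKDPGWQYLRRTREQVLEDQSKPYDSKKNVWIPDPEEGYLAGEITATKGDQVTIVTA 60"
MIDLINE = "MEHEKDPGWQYLRR+REQ+LEDQSKPYDSKKN WIPDPEEGYLAGEITATKGDQVTIVTA"


def parse_alignment_line(line: str):
    """Split 'Query 1 SEQUENCE 60' into label, start, residues, end."""
    label, start, residues, end = line.split()
    return label, int(start), residues, int(end)


def alignment_stats(mid: str) -> dict:
    """Count identical, similar, and blank columns in the middle line."""
    identities = sum(ch.isalpha() for ch in mid)
    similar = mid.count("+")
    return {
        "length": len(mid),
        "identities": identities,
        "positives": identities + similar,  # identical or similar
        "blanks": mid.count(" "),           # mismatches (or gap columns)
    }


label, start, residues, end = parse_alignment_line(QUERY_LINE)
# With no gaps on the line, end - start + 1 equals the residues shown.
print(label, start, end, end - start + 1 == len(residues))

stats = alignment_stats(MIDLINE)
print(stats)
print(f"identity: {100 * stats['identities'] / stats['length']:.0f}%")
```

For this alignment that gives 57 identities and 59 positives out of 60 positions, which is the kind of figure BLAST reports above each alignment.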

As for the database: BLAST is not a database; GenBank is the database. BLAST stands for Basic Local Alignment Search Tool, so it is a program that helps you align sequences and find the best match.

If you have any questions, don't be shy about asking. I have taken a bioinformatics course and have been using BLAST and various databases for a few years now.
 
  • #3


Biogenetics databases, such as those hosted by the National Center for Biotechnology Information (NCBI), are essential tools for researchers and scientists in the field. These databases contain a vast amount of biological information, including DNA and protein sequences, gene expression data, and genetic variations. NCBI is not a single database: its website connects many databases and tools, which is what makes it so useful.

One of the most commonly used tools at NCBI is BLAST, which stands for Basic Local Alignment Search Tool. BLAST searches a database for sequences similar to a nucleotide or protein sequence that you supply. In your example, the protein sequence "MEHEKDPGWQYLRRTREQVLEDQSKPYDSKKNVWIPDPEEGYLAGEITATKGDQVTIVTA" is being queried against the database to find similar sequences.

Each letter in the sequence represents an amino acid, the building block of proteins. A letter is not a codon: a codon is a triplet of nucleotides in the gene that codes for one amino acid, whereas the letters here are one-letter amino acid codes. The query is the sequence you submitted, and the subject is a sequence from the database that BLAST matched to it. In your example the two are nearly identical, differing at only three positions. BLAST uses a fast heuristic to align sequences and find the best matches, which can provide valuable insights into the function and evolution of proteins.
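To give a feel for what "local alignment" means, here is a minimal sketch of the Smith-Waterman dynamic program, the exact method that BLAST's faster seed-and-extend heuristic approximates. This is not BLAST's own algorithm. The scoring values (match +2, mismatch -1, gap -2) are arbitrary assumptions for illustration; a real protein search would use a substitution matrix such as BLOSUM62 with separate gap-open and gap-extend penalties.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between strings a and b."""
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# A shared stretch scores well even when the flanking letters differ,
# which is what makes the alignment "local".
print(smith_waterman_score("XXACGTYY", "ACGT"))
print(smith_waterman_score("AAA", "TTT"))
```

The table has one cell per pair of positions, so the exact method takes time proportional to the product of the two sequence lengths. That cost is the reason a database-scale search tool relies on a heuristic instead.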

As for your question about finding resources to learn more about these databases, the NCBI website itself has a lot of resources and tutorials that can help you understand how to use their databases and tools effectively. Additionally, there are many online courses and tutorials available on websites like Coursera and YouTube that can provide a more in-depth understanding of biogenetics databases and their applications.

In conclusion, biogenetics databases, such as the one hosted by NCBI, are crucial tools for researchers and scientists in the field. Blast, in particular, is a powerful tool for finding similarities between biological sequences, which can provide valuable insights into the function and evolution of proteins. With the vast amount of biological data being generated every day, these databases will continue to play a crucial role in advancing our understanding of the biological world.
 

What is biogenetics?

Biogenetics is the study of how genes and genetic information influence the development and function of living organisms.

What are biogenetics databases?

Biogenetics databases are digital repositories that store genetic information and data, such as DNA sequences, gene expression data, and genetic variation data, from various organisms. These databases are used by scientists to analyze and compare genetic data to gain insights into various biological processes.

How are biogenetics databases used?

Biogenetics databases are used by scientists to store and access genetic information, which can then be used for research and analysis. They can also be used to identify patterns and relationships between genes and traits, as well as to discover new genes and genetic variations that may be linked to specific diseases.

What are the benefits of using biogenetics databases?

The use of biogenetics databases allows for more efficient and accurate analysis of genetic data. It also promotes collaboration and data sharing among scientists, leading to faster and more comprehensive research. These databases also play a crucial role in the development of personalized medicine, as they can provide valuable insights into an individual's genetic makeup and potential health risks.

What are the potential ethical concerns surrounding biogenetics databases?

Some potential ethical concerns surrounding biogenetics databases include privacy and security of genetic information, potential discrimination based on genetic data, and the use of genetic data for profit without the informed consent of individuals. There is also the risk of misinterpretation or misuse of genetic data, which could have harmful consequences. Therefore, it is essential to have strict guidelines and regulations in place to protect the ethical use of biogenetics databases.
