Querying the database

Firstly make sure you have downloaded and built the database.

Below you can find basic examples. For more complex examples, please refer to the complete documentation.

taxids

Several operations on taxids are available in taxadb:

>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> dbname = "taxadb2/test/test_db.sqlite"
>>> ncbi = {
>>>    'taxid': TaxID(dbtype='sqlite', dbname=dbname),
>>>    'names': SciName(dbtype='sqlite', dbname=dbname),
>>>    'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)
>>> }

>>> taxid2name = ncbi['taxid'].sci_name(2)
>>> print(taxid2name)
Bacteria
>>> lineage = ncbi['taxid'].lineage_name(17)
>>> print(lineage[:5])
['Methylophilus methylotrophus', 'Methylophilus', 'Methylophilaceae', 'Nitrosomonadales', 'Betaproteobacteria']
>>> lineage = ncbi['taxid'].lineage_name(17, reverse=True)
>>> print(lineage[:5])
['cellular organisms', 'Bacteria', 'Pseudomonadati', 'Pseudomonadota', 'Betaproteobacteria']

>>> ncbi['taxid'].has_parent(17, 'Bacteria')
True

You can also get the taxid from the scientific name

Get the taxid from a scientific name.

>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> dbname = "taxadb2/test/test_db.sqlite"
>>> ncbi = {
>>>    'taxid': TaxID(dbtype='sqlite', dbname=dbname),
>>>    'names': SciName(dbtype='sqlite', dbname=dbname),
>>>    'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)
>>> }

>>> name2taxid = ncbi['names'].taxid('Pseudomonadota')
>>> print(name2taxid)
1224

Automatic detection of old taxIDs imported from merged.dmp.

>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> dbname = "taxadb2/test/test_db.sqlite"
>>> ncbi = {
>>>    'taxid': TaxID(dbtype='sqlite', dbname=dbname),
>>>    'names': SciName(dbtype='sqlite', dbname=dbname),
>>>    'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)
>>> }

>>> taxid2name = ncbi['taxid'].sci_name(30)
TaxID 30 is deprecated, using 29 instead.
>>> print(taxid2name)
Myxococcales

If you are using MySQL or postgres, you’ll have to provide your username and password (and optionally the port and hostname):

>>> from taxadb2.taxid import TaxID

>>> taxid = TaxID(dbype='postgres', dbname='taxadb',
                    username='taxadb', password='*****')
>>> name = taxid.sci_name(2)
>>> print(name)
Bacteria

accession numbers

Get taxonomic information from accession number(s).

>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> dbname = "taxadb2/test/test_db.sqlite"
>>> ncbi = {
>>>    'taxid': TaxID(dbtype='sqlite', dbname=dbname),
>>>    'names': SciName(dbtype='sqlite', dbname=dbname),
>>>    'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)
>>> }

>>> my_accessions = ['A01460']
>>> taxids = ncbi['accessionid'].taxid(my_accessions)
>>> taxids
<generator object AccessionID.taxid at 0x103e21bd0>
>>> for ti in taxids:
    print(ti)
('A01460', 17)

Using configuration file or environment variable

Note: This part was only tested sporadically as compared to the original implementation taxadb

Taxadb2 can now take profit of configuration file or environment variable to set database connection parameters.

  • Using configuration file

You can pass a configuration file when building your object:

>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> config_path = "taxadb2/test/taxadb2.cfg"
>>> ncbi = {
>>>    'taxid': TaxID(config=config_path),
>>>    'names': SciName(config=config_path),
>>>    'accessionid': AccessionID(config=config_path)
>>> }

>>> ncbi['taxid'].sci_name(2)
Bacteria
>>> ...
  • Configuration file format

The configuration file must use syntax supported by configparser object. You must set database connection parameters in a section called DBSETTINGS as below:

Here you can see one example using sql

[sql]
dbname=taxadb2/test/test_db.sqlite
username=
password=
hostname=
port=
dbtype=sqlite

Some value will default it they are not set.

hostname will be set to value localhost and port is set to 5432 for dbtype=postgres and 3306 for dbtype=mysql.

  • Using environment variable

Taxadb2 can as well use an environment variable to automatically point the application to a configuration file. To take profit of it, just set TAXADB2_CONFIG to the path of your configuration file:

(bash) export TAXADB2_CONFIG='taxadb2/test/taxadb2.cfg'
(csh) set TAXADB2_CONFIG='taxadb2/test/taxadb2.cfg'

Then, just create your object as follow:

>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> ncbi = {
>>>    'taxid': TaxID(),
>>>    'names': SciName(),
>>>    'accessionid': AccessionID()
>>> }

>>> ncbi['taxid'].sci_name(2)
Bacteria
>>> ...

Note

Arguments passed to object initiation will always overwrite default values as well as values that might have been set by configuration file or environment variable TAXADB2_CONFIG.