Create a database

Build your own database

To create your own database, first download the required data from the NCBI FTP server. The following command does this for you:

taxadb2 download --outdir taxadb --type taxa

Here, --outdir is the directory into which the data is downloaded.

SQLite

The following command will create an SQLite database in the current directory.

taxadb2 create --division taxa --input taxadb --dbname taxadb.sqlite
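
To confirm that the build produced something, one option is to list the tables in the resulting file. The following is a minimal sketch using only Python's standard sqlite3 module. The helper name list_tables is hypothetical and not part of taxadb2, and the code makes no assumption about which tables taxadb2 creates.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the table names found in an SQLite file, sorted by name.

    Hypothetical helper, not part of taxadb2. The file is opened read-only,
    so a wrong path raises sqlite3.OperationalError instead of silently
    creating an empty database.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, list_tables("taxadb.sqlite") should return a non-empty list after a successful build.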

MySQL

Note: Unlike in the original taxadb implementation, PostgreSQL and MySQL support has not been tested here, but it may still work as expected.

Creating a database is a very vendor-specific task. Peewee, like most ORMs, can create tables but not databases. To use taxadb with MySQL, you’ll have to create the database yourself:

mysql -u $user -p
mysql> CREATE DATABASE taxadb;

Then fill the database with data:

taxadb2 create -i taxadb --dbname taxadb --dbtype mysql --username user --password secret --port 3306 --hostname localhost

PostgreSQL

Note: Unlike in the original taxadb implementation, PostgreSQL and MySQL support has not been tested here, but it may still work as expected.

Creating a database is a very vendor-specific task. Peewee, like most ORMs, can create tables but not databases. To use taxadb with PostgreSQL, you’ll have to create the database yourself:

psql -U $user -d postgres
psql> CREATE DATABASE taxadb;

Then fill the database with data:

taxadb2 create -i taxadb --dbname taxadb --dbtype postgres --username user --password secret --port 5432 --hostname localhost

The following options take default values if they are not set on the command line:

  • --port (5432 for PostgreSQL, 3306 for MySQL)

  • --hostname (localhost)
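
For scripting, the defaults listed above can be mirrored in a small wrapper. The sketch below only assembles the argument list for taxadb2 create from the flags shown in this section. The names build_create_args and DEFAULT_PORTS are hypothetical, and the code does not reflect how taxadb2 resolves defaults internally.

```python
# Default ports as documented above; the dict itself is illustrative.
DEFAULT_PORTS = {"postgres": 5432, "mysql": 3306}


def build_create_args(dbtype, dbname, input_dir, username, password,
                      port=None, hostname="localhost"):
    """Assemble a `taxadb2 create` command line as a list of strings.

    Hypothetical helper: it uses only the flags shown in this section and
    fills in --port and --hostname explicitly when the caller omits them.
    """
    if dbtype not in DEFAULT_PORTS:
        raise ValueError(f"unsupported dbtype: {dbtype!r}")
    if port is None:
        port = DEFAULT_PORTS[dbtype]
    return [
        "taxadb2", "create",
        "-i", input_dir,
        "--dbname", dbname,
        "--dbtype", dbtype,
        "--username", username,
        "--password", password,
        "--port", str(port),
        "--hostname", hostname,
    ]
```

The resulting list can be passed to subprocess.run(args, check=True). Note that a password given on the command line may be visible to other local users through the process list, so this is best treated as a convenience for local testing.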

For more information about all available options, run:

taxadb2 download --help
taxadb2 create --help

Warning

When building your database from downloaded data, you can speed up loading with the --fast option. This option skips checking whether each accession id already exists in the database before loading the related information. In some cases, for example when the same file is loaded twice, this can lead to duplicate entries in the accession table.
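
The trade-off can be illustrated with a toy model. The sketch below is not taxadb2's loading code: it uses an in-memory SQLite table whose layout (accession, taxid) is an assumption, made-up accession ids, and a hypothetical load function whose fast flag simply skips the existence check.

```python
import sqlite3


def load(conn, records, fast=False):
    """Insert (accession, taxid) pairs; with fast=True, skip the lookup."""
    for accession, taxid in records:
        if not fast:
            # Slow path: one lookup per record before inserting.
            found = conn.execute(
                "SELECT 1 FROM accession WHERE accession = ?", (accession,)
            ).fetchone()
            if found:
                continue
        conn.execute(
            "INSERT INTO accession (accession, taxid) VALUES (?, ?)",
            (accession, taxid),
        )
    conn.commit()


def duplicated_accessions(conn):
    """Return accession ids that occur more than once, sorted by id."""
    rows = conn.execute(
        "SELECT accession FROM accession "
        "GROUP BY accession HAVING COUNT(*) > 1 ORDER BY accession"
    ).fetchall()
    return [acc for (acc,) in rows]


def demo(fast):
    """Load the same made-up batch twice; return (row count, duplicates)."""
    conn = sqlite3.connect(":memory:")
    # Assumed table layout, for illustration only.
    conn.execute(
        "CREATE TABLE accession ("
        "id INTEGER PRIMARY KEY, accession TEXT, taxid INTEGER)"
    )
    batch = [("ACC0001", 1), ("ACC0002", 2)]
    load(conn, batch, fast=fast)
    load(conn, batch, fast=fast)
    total = conn.execute("SELECT COUNT(*) FROM accession").fetchone()[0]
    dupes = duplicated_accessions(conn)
    conn.close()
    return total, dupes
```

In this model, demo(fast=False) leaves two rows and no duplicates, while demo(fast=True) leaves four rows with both ids duplicated. A query like the one in duplicated_accessions could be adapted to check a real database after a --fast load, once the actual column names have been confirmed.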