Tetraselmis suecica genome assembly

Genome assembly of the green microalga Tetraselmis suecica from ONT Nanopore sequencing and Hi-C sequencing

 


Title

Tetraselmis suecica genome assembly

Date (Publication)
2019-05-28
Citation identifier
FR-330-715-368-00032-IFR_BIOINFO_GENOME_TETRASELMIS_SUECICA
Abstract

Genome assembly of the green microalga Tetraselmis suecica from ONT Nanopore sequencing and Hi-C sequencing

Credit

Ifremer - Laboratoire Ecotoxicologie et Laboratoire Physiologie et Biotechnologie des Algues

Point of contact
Organisation name | Individual name | Electronic mail address | Role
IFREMER | Sussarellu Rossana | Rossana.Sussarellu@ifremer.fr | Author
IFREMER | Carrier Gregory | Gregory.Carrier@ifremer.fr | Author
IFREMER, Laboratoire GENomique et fonction des microALGues (GENALG) | Sussarellu Rossana | Rossana.Sussarellu@ifremer.fr | Publisher

GEMET - INSPIRE themes, version 1.0

  • Habitats and biotopes

Sextant themes

  • /Biological Environment/Bioinformatics

ODATIS aggregation parameters and Essential Variable names

  • Bioinformatics

Access constraints
Restricted
Use constraints
Restricted
Language
French
Character set
UTF8
Topic category
  • Environment
Reference system identifier
EPSG / WGS 84 (EPSG:4326) / 8.6
Distribution format
Name Version
OnLine resource
Protocol | Linkage | Name
WWW:DOWNLOAD-1.0-link--download | https://data-dataref.ifremer.fr/bioinfo/ifremer/brm/tetraselmis-suecica/ | Download link
NETWORK:LINK | /home/ref-bioinfo-public/ifremer/brm/tetraselmis-suecica |
WWW:LINK-1.0-http--metadata-URL | https://doi.org/10.12770/c60518a3-ad2a-4f8b-8437-52f44608f6e4 | Digital Object Identifier (DOI)

Hierarchy level
Dataset
Statement

The marine microalga Tetraselmis suecica (T. suecica) CCMP 904 was obtained from the Provasoli–Guillard National Center for Marine Algae and Microbiota (NCMA). T. suecica cultures were grown in 2-L round sterile borosilicate glass flasks filled with 1 L of sterile Conway medium (also called Walne's medium; Walne, 1970 in Anderson 2005), inoculated at an initial concentration of 100 000 cells mL-1 and maintained in a thermo-regulated room at 21 ± 1 °C, under a light intensity of 150 ± 5 µmol m-2 s-1, with a dark:light cycle of 8:16 h and constant bubbling of air enriched with carbon dioxide. Cultures were grown for six days before cell harvesting. The day before sampling, the cultures were treated with an antibiotic-antimycotic solution (A5955, Sigma-Aldrich; 1 mL per litre of cell culture). Cells were harvested by gently centrifuging the 6-day-old cultures (200 rpm, 10 min, 4 °C). Cell pellets were rinsed twice with filtered seawater.


Three runs on R9.4.1 flow cells were performed on a MinION (first run: 1.33 Gb, 173 063 reads, mean length 7 658 bp, mean quality score 9.8; second run: 356 Mb, 158 984 reads, mean length 2 238 bp, mean quality score 8.7; third run: 746 Mb, 1 947 776 reads, mean length 383 bp, mean quality score 9.2). Hi-C sequencing was performed on an Illumina MiSeq (2 x 36 bp), in collaboration with Laboratoire CNRS – UMR3525, Régulation Spatiale des Génomes, Département de Génomes et Génétique, UMR: Génétique des génomes.
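
As a quick plausibility check on the run statistics above, the mean read length implied by each run's yield and read count can be compared with the reported mean. The sketch below is only illustrative: the figures are copied from the statement, while the names `RUNS` and `implied_mean_length` are hypothetical, and the rounded yields (e.g. 1.33 Gb) mean small differences are expected.

```python
# Per-run figures copied from the statement:
# (yield in bases, number of reads, reported mean read length in bp).
RUNS = {
    "run1": (1.33e9, 173_063, 7_658),
    "run2": (356e6, 158_984, 2_238),
    "run3": (746e6, 1_947_776, 383),
}


def implied_mean_length(total_bases, n_reads):
    """Mean read length implied by total yield and read count."""
    return total_bases / n_reads


if __name__ == "__main__":
    for name, (bases, reads, reported) in RUNS.items():
        implied = implied_mean_length(bases, reads)
        rel_diff = abs(implied - reported) / reported
        print(f"{name}: implied {implied:.0f} bp, "
              f"reported {reported} bp, difference {rel_diff:.1%}")
```

The reported yields are rounded, so agreement within about one percent is what such a check can be expected to show.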


Read quality was first checked with NanoPlot v1.28.0 (https://doi.org/10.1093/bioinformatics/bty149). Adapters were trimmed with Porechop v0.2.4 (https://github.com/rrwick/Porechop), and low-quality reads were filtered out with NanoFilt v2.6 (https://doi.org/10.1093/bioinformatics/bty149). This preprocessing was done separately for each set of reads: reads from the first and second runs were filtered at an average quality threshold of 9 (Phred scale) and a length threshold of 1,000 nt, whereas reads from the third run were filtered at the same quality threshold of 9 and a length threshold of 100 nt (because of the low average read length of this third run). The three sets of filtered reads were then merged into a single data set.
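
The per-run filtering and merging described above can be sketched in a few lines of Python. This is not the NanoFilt implementation, only an illustration under stated assumptions: a read is kept when it meets both the quality and the length threshold, the average quality is computed by averaging per-base error probabilities, and reads are represented as hypothetical `(name, sequence, phred_scores)` tuples.

```python
import math

# Thresholds taken from the statement; the dictionary keys are hypothetical labels.
MIN_QUALITY = 9
MIN_LENGTH = {"run1": 1000, "run2": 1000, "run3": 100}


def mean_quality(phred_scores):
    """Average quality, computed via the mean error probability (assumption)."""
    probs = [10 ** (-q / 10) for q in phred_scores]
    return -10 * math.log10(sum(probs) / len(probs))


def keep_read(length, phred_scores, run):
    """Keep a read only if it passes both thresholds for its run (assumption)."""
    return length >= MIN_LENGTH[run] and mean_quality(phred_scores) >= MIN_QUALITY


def filter_and_merge(reads_by_run):
    """Filter each run separately, then merge the survivors into one list."""
    merged = []
    for run, reads in reads_by_run.items():
        merged.extend(
            read for read in reads if keep_read(len(read[1]), read[2], run)
        )
    return merged
```

Keeping the length threshold per run mirrors the choice made for the third run, where a 1,000 nt cutoff would have discarded most reads.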


This set of reads was assembled with the de novo long-read assembler Flye v2.5 (https://doi.org/10.1038/s41587-019-0072-8). The assembly was then polished with Pilon v1.23 (https://doi.org/10.1371/journal.pone.0112963) using RNA short reads. The final assembly was quality-controlled with QUAST v5.0 (https://doi.org/10.1093/bioinformatics/btt086) for basic metrics, BlobTools v1.0.1 (https://doi.org/10.12688/f1000research.12232.1) to check for contamination, and BUSCO v3.2 (https://doi.org/10.1007/978-1-4939-9173-0_14) for single-copy ortholog completeness. The BUSCO database used was Eukaryota_odb9, which consists of 303 single-copy orthologous genes.
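
To make the two kinds of quality metrics mentioned above concrete, the sketch below computes an N50 from a list of contig lengths and a completeness percentage against the 303 genes of Eukaryota_odb9. It is a minimal illustration, not the QUAST or BUSCO code; the function names are hypothetical and the contig lengths in the usage example are toy values, not results from this assembly.

```python
EUKARYOTA_ODB9_GENES = 303  # size of the BUSCO set named in the statement


def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


def completeness_percent(n_complete, n_total=EUKARYOTA_ODB9_GENES):
    """Share of the ortholog set found complete, as a percentage."""
    return 100.0 * n_complete / n_total


if __name__ == "__main__":
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # toy values for illustration only
    print("N50 of toy contigs:", n50(toy_contigs))
```

Running the same two metrics before and after scaffolding gives a simple way to see whether contiguity improved without losing gene content.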


Finally, the assembly was scaffolded with the Hi-C sequencing data using instaGRAAL (https://doi.org/10.1038/ncomms6695), followed by quality control with QUAST and BUSCO.

Metadata

File identifier
c60518a3-ad2a-4f8b-8437-52f44608f6e4
Metadata language
English
Character set
UTF8
Hierarchy level
Dataset
Date stamp
2024-07-17T14:51:00.612Z
Metadata standard name

ISO 19115:2003/19139 - SEXTANT

Metadata standard version

1.0

Metadata author
Organisation name Individual name Electronic mail address Role

IFREMER

Sussarellu Rossana

Rossana.Sussarellu@ifremer.fr

Local service desk
 
 

DOI
10.12770/c60518a3-ad2a-4f8b-8437-52f44608f6e4
