BioSonics Time Series

Data products for BioSonics echosounders are described here. BioSonics echosounders are bio-acoustic echosounders, nominally tuned to image the water column (in ONC's case, mounted on the seabed, facing upward), revealing information about the biomass contained within: typically migratory zooplankton such as copepods, fish and, occasionally, marine mammals. Fisheries experts may use the data to identify and measure fish biomass and behaviour. Echosounders used in this way are often multi-frequency/multi-channel and calibrated so that Target Strength measurements are available to help identify species in the water column, as is the case for the BioSonics echosounders. Visualization and analysis software such as EchoView is often used with this data; EchoView can read and work with BioSonics DT4 files, including reading the calibration information. For users who do not have or use EchoView, DT4 files are also readable by BioSonics software (more information in the Formats section). Otherwise, accessible formats in the BioSonics Time Series data product include MAT files and CSV text files, plus images/plots available as PNG and PDF files. Future updates may add a netCDF format for this data product.

Please note that these data products depend on daily raw log file generation and on the conversion of log files to DT4 format. This occurs once daily, shortly after midnight UTC (4 or 5 PM Pacific time, depending on daylight saving). The raw file generation takes up to a few hours per day, and the conversion from log to DT4 format then takes about 20 minutes for one day of data. BioSonics MAT files are generated from DT4 files and archived for faster data product generation; however, these files may also be generated on-the-fly, filling in any that are not yet pre-generated.

Oceans 3.0 API filter: dataProductCode=BSTS

Revision History

Data Product Options

Formats

This data is available in DT4, CSV and MAT file formats, and as plots in PNG and PDF formats. A common netCDF format for all echosounders is in development; contact us if you would prefer a netCDF product. Content descriptions and example files are provided below.

To produce the file, the following notes apply:

DT4

This format is specific to the manufacturer and consists of sections called tuples (e.g., channel descriptor, time and ping tuples). BioSonics data acquisition software normally stores data in this way. Although we use custom-built drivers to communicate with our instruments, we can use the raw data in the log file to produce a DT4 file, which can be interpreted by BioSonics post-processing software (most recent version available from the BioSonics website) in play-back mode, and by commercial software like EchoView.

A new file is generated each day, and whenever the driver is restarted. The reference time in the time tuple and the ping time in the ping tuples correspond to the times the measurements were received at the ONC Canada shore station (in place of the internal instrument clock values).

The format and tuple contents are further described in the manufacturer's documentation. However, two tuples that are regularly included in DT4 files are not described in that documentation.

Oceans 3.0 API filter: extension=dt4

MAT

MAT files (v7) can be opened using MathWorks MATLAB 7.3 (R2006b) or later. A new file is created every hour from the start of the parent DT4 file(s). Some test data was collected while the device was pinging much faster than usual, which produces very large MAT files; such files are saved as version 7.3 but will still work for almost all users. One-hour files were selected so that the memory requirements are not too onerous. Opening a one-hour BioSonics MAT file in MATLAB requires about 500 MB of contiguous available memory. Various plotting routines and analyses may cause MATLAB to use up to 2 GB of memory.

MAT files may contain calibrated (Target Strength or Volume Backscatter) data as per the selected data product option. The resample option, known as ensemble/ping averaging, is also applicable. When either or both options are used, MAT files are generated on-the-fly from the stored non-resampled, non-calibrated MAT files or from DT4 files. Otherwise, MAT files can be retrieved directly from the archive, in which case the Data Search request takes only a second or two to complete. Also, when either or both options are requested, the data.vals matrix is recalculated and converted from uint32 to double, so it becomes twice as large on disk and in memory.
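The doubling in size follows from the element widths: a uint32 value occupies 4 bytes and a double occupies 8. The Python sketch below only illustrates that arithmetic for the in-memory matrix. The sample count is borrowed from the CSV example on this page, and the ping count is a made-up illustration value, not a documented instrument setting.

```python
# Rough in-memory size of data.vals for one channel, before and after
# conversion from uint32 to double. Illustrative arithmetic only.
BYTES_UINT32 = 4
BYTES_DOUBLE = 8


def vals_size_bytes(n_samples, n_pings, bytes_per_value):
    """Return the size in bytes of an n_samples x n_pings matrix."""
    return n_samples * n_pings * bytes_per_value


n_samples = 5610  # Sample_count taken from the CSV example on this page
n_pings = 3600    # ASSUMPTION: one ping per second for one hour

raw_bytes = vals_size_bytes(n_samples, n_pings, BYTES_UINT32)
double_bytes = vals_size_bytes(n_samples, n_pings, BYTES_DOUBLE)
print(f"uint32: {raw_bytes / 1e6:.1f} MB, double: {double_bytes / 1e6:.1f} MB")
```

Real files hold one such matrix per channel plus metadata, and MAT files may be compressed on disk, so actual sizes will differ from this estimate.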

With a calibration option applied, the units structure entry for vals (units.vals) will be '(dB re 1 m^{-1})' for Volume Backscatter, '(dB re 1 m^{2})' for Target Strength, or 'raw counts' for raw data. The filename will also be modified; see the example MAT file below.

With the ensemble option selected, a new field will appear in the data structure (not shown below): data.countInEnsemble, representing the number of raw pings/data points in each ensemble average. data.snd will also have a new field, data.snd.ensemblePeriod, which is simply the ensemble period in seconds (also not shown below). The filename will also have a modifier with the ensemble period, as in the attached example file:

FolgerPassage_FolgerDeep_Echosounder-Bioacoustic_20170911T184130.000Z_20170911T185230.000Z-Ensemble30s_CalibratedTS.mat
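For scripted handling of downloaded files, the time range and modifiers can be pulled out of a name like the one above. The following Python sketch assumes the layout seen in that single example (two timestamps joined by an underscore, then an optional hyphen-prefixed, underscore-separated modifier list); it is not an official naming specification, and `parse_name` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

FILENAME = (
    "FolgerPassage_FolgerDeep_Echosounder-Bioacoustic_"
    "20170911T184130.000Z_20170911T185230.000Z-Ensemble30s_CalibratedTS.mat"
)

# ASSUMED layout: <start>_<end>[-<modifier>_<modifier>...].<extension>
PATTERN = re.compile(
    r"(?P<start>\d{8}T\d{6}\.\d{3}Z)_(?P<end>\d{8}T\d{6}\.\d{3}Z)"
    r"(?:-(?P<modifiers>[^.]+))?\.(?P<ext>[A-Za-z0-9.]+)$"
)


def to_utc(stamp):
    """Parse a timestamp such as 20170911T184130.000Z as a UTC datetime."""
    parsed = datetime.strptime(stamp, "%Y%m%dT%H%M%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def parse_name(name):
    """Return (start, end, modifiers, extension) for a data product file name."""
    match = PATTERN.search(name)
    if match is None:
        raise ValueError(f"unrecognised file name: {name}")
    modifiers = match["modifiers"].split("_") if match["modifiers"] else []
    return to_utc(match["start"]), to_utc(match["end"]), modifiers, match["ext"]


start, end, modifiers, ext = parse_name(FILENAME)
print(start.isoformat(), end.isoformat(), modifiers, ext)
```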

Each MAT file contains the following structures: meta, data, units. Entries in italic are not applicable to the type of BioSonics echosounders deployed, but are standard data fields in the DT4 file specification. Please note that the variable naming scheme here does not conform to our standard (structures should be big camel case, for example), but is maintained for backward compatibility and historical reasons.

data: structured array containing the BioSonics data (one structure per configuration in cycle), having the following fields which pertain to the dt4 contents (for details, refer to the manufacturer manual).

name: serial number

frequency: transducer frequency (Hz)

time: datenum vector, UTC timestamp (time source is ONC shore station)

range: vector of distance to each bin

pingnum: ping number 

vals: amplitude of return signal (A/D Counts - main data matrix - data type uint32)

nsamps: number of samples in the ping tuples (read from the ping tuple)

env: structure containing environmental configuration values

snd: structure containing details of the channel descriptor tuple

opmode: operating mode

units: structure containing unit of measure for fields in structures above. For instance, units.lat='degrees N'.
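Users who read the MAT file outside MATLAB need to convert the datenum values in the time field themselves. A minimal Python sketch of that conversion is shown here; it assumes the datenum values have already been loaded as plain floats by some other means, and `datenum_to_datetime` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def datenum_to_datetime(datenum):
    """Convert a MATLAB datenum (fractional days) to a naive datetime in UTC."""
    whole_days = int(datenum)
    fraction = datenum - whole_days
    # MATLAB counts days from year 0000, Python ordinals from 0001-01-01,
    # so the two day counts differ by 366.
    return datetime.fromordinal(whole_days - 366) + timedelta(days=fraction)


print(datenum_to_datetime(736949.5))  # 2017-09-11 12:00:00
```

The result is a naive datetime; since the time field is documented as UTC, callers may want to attach a UTC timezone explicitly.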

Oceans 3.0 API filter: extension=mat

CSV

The CSV (comma-separated values) format file contains similar data to the MAT file. It is based on the CSV format produced by ASL software for AZFP, AWCP and ZAP echosounders, a format that is readily accessible by EchoView's import function. It is a relatively simple text format; the ASL time series data product also offers this format. If users choose either of the calibrated options, they can view the data directly in EchoView without the need for further information. Uncompressed, the CSV format can be quite large, and it is also the slowest BioSonics format to produce (it is always generated on-the-fly). Here is an example of the CSV files:

FolgerPassage_FolgerDeep_Echosounder-Bioacoustic_20170911T184100.000Z_20170911T185300.000Z-Ensemble60s_EchoView_038kHz.raw.csv

Ping_date,Ping_time,Ping_milliseconds,Range_start,Range_stop,Sample_count
2017-09-11,18:41:30,000,0.017825,99.9966,5610,373212,601876,820720,...
2017-09-11,18:42:30,000,0.017825,99.9966,5610,375332,604154,822206,...
2017-09-11,18:43:30,000,0.017825,99.9966,5610,358626,586227,807488,...

The file-name modifiers include the ensemble averaging period, the string '-EchoView' (to avoid any possible confusion with other CSV products), the echosounder channel centre frequency, and the data type, which is one of: sv (Volume Backscatter), ts (Target Strength), raw (uncalibrated).
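In the example rows, the header names six columns and the per-ping sample values follow them on the same line. A Python sketch for reading that layout is given below. It assumes every file follows the layout of the example, and it uses a synthetic two-ping file with Sample_count shortened to 3 (only the first three sample values of the example rows are reused); `read_pings` is a hypothetical helper.

```python
import csv
import io

# Synthetic file in the layout of the example; Sample_count is shortened to 3
# for illustration and does not reflect a real file.
EXAMPLE = """Ping_date,Ping_time,Ping_milliseconds,Range_start,Range_stop,Sample_count
2017-09-11,18:41:30,000,0.017825,99.9966,3,373212,601876,820720
2017-09-11,18:42:30,000,0.017825,99.9966,3,375332,604154,822206
"""


def read_pings(text_stream):
    """Return a list of (metadata dict, sample list) pairs, one per ping."""
    reader = csv.reader(text_stream)
    header = next(reader)
    n_meta = len(header)  # the named columns; samples come after them
    pings = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        meta = dict(zip(header, row[:n_meta]))
        count = int(meta["Sample_count"])
        samples = [float(value) for value in row[n_meta:n_meta + count]]
        if len(samples) != count:
            raise ValueError("row has fewer samples than Sample_count")
        pings.append((meta, samples))
    return pings


pings = read_pings(io.StringIO(EXAMPLE))
for meta, samples in pings:
    print(meta["Ping_date"], meta["Ping_time"], samples)
```

For a real file, pass an open file object (opened with `newline=""`) instead of the `io.StringIO` wrapper.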

Oceans 3.0 API filter: extension=csv

Plots: PNG and PDF

There are two formats of plots available: PNG or PDF. A PDF plot file can contain multiple plots as separate pages, and the graphics are vector images, which are better for printing or viewing at high resolution. The PNG format is a single plot per file in a raster image, which is good for quick viewing and sharing. The data and appearance of the two plot formats are the same. These plots are also known as echograms: they plot echo intensity (backscatter, target strength or raw counts) vs time and depth, and are basically what you would see as a sonar operator. The calibration data product option switches the values plotted between raw counts (uncalibrated), Volume Backscatter (Sv) and Target Strength (TS), but does not otherwise affect the form and function of the plots. The plots are affected by the ensemble/averaging option, the number of channels, and the sun elevation data product option. As for all formats, plots always break on configuration changes. If plots extend over gaps in the data, users will see the gaps represented by white space.

Oceans 3.0 API filter: extension={png,pdf}
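The filter lines on this page can be collected into key/value pairs when preparing a scripted request. The Python sketch below only assembles the two filters named here (dataProductCode and extension) into a query-string fragment. The endpoint, the access token and any other required filters (such as location, device and date range) are not covered on this page and are deliberately left out; `build_filters` is a hypothetical helper.

```python
from urllib.parse import urlencode

DATA_PRODUCT_CODE = "BSTS"
# Extensions listed on this page for the BioSonics Time Series product.
EXTENSIONS = ("dt4", "mat", "csv", "png", "pdf")


def build_filters(extension):
    """Return the filter pairs for one file format of this data product."""
    if extension not in EXTENSIONS:
        raise ValueError(f"unsupported extension: {extension}")
    return {"dataProductCode": DATA_PRODUCT_CODE, "extension": extension}


for ext in EXTENSIONS:
    print(urlencode(build_filters(ext)))
```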

Plots: Daily vs Multi-day plots

There are two variations of plots in terms of duration, depending on the ensemble period option selected. Daily plots are generated with the default (no averaging) option and show a maximum of one day of data per plot. For all plots, the data has to be resampled so that an ensemble or raw ping corresponds to the width of at least one pixel in the PNG (ideally one to one), otherwise rendering the image will alias the data; so, in spite of selecting the no-averaging option, some resampling may happen. This resampling is important because normal resizing on computer screens applies linear image anti-aliasing routines, which are not appropriate for logarithmic scale images. The minimum ensemble period for a one-day plot is ~30 seconds, as there are about 2560 pixels along the time axis in these plots. If you select less than one day, you can effectively zoom in and see higher temporal resolutions.
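The relationship between plot duration and the shortest usable ensemble period is simple division of the time span by the pixel count. A small Python sketch of that arithmetic, using the approximate pixel width quoted above (the function name is hypothetical):

```python
SECONDS_PER_DAY = 24 * 60 * 60
PLOT_WIDTH_PX = 2560  # approximate time-axis width quoted above


def min_ensemble_period(duration_s, width_px=PLOT_WIDTH_PX):
    """Shortest period (s) at which one ensemble still spans one pixel."""
    return duration_s / width_px


print(min_ensemble_period(SECONDS_PER_DAY))  # 33.75
print(min_ensemble_period(60 * 60))          # 1.40625
```

A full day works out to roughly 34 seconds per pixel, in line with the ~30 second figure, while a one-hour search can resolve ensembles of about 1.4 seconds.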

If an ensemble period is selected by the user (other than the none option), the plots will be multi-day plots and only one plot will be generated over the search time range (excluding configuration changes and data/memory limits, which break the plot). Ensemble averaging will be applied as selected, except that when the selected ensemble period is not long enough to prevent aliasing and distortion, the ensemble period will be increased automatically. Below is an example of a daily echogram vs a multi-day echogram for the same, very short chunk of poor test data:


Figures: Daily Echogram Example; Multi-day Echogram Example

The different averaging ensemble period leads to slightly different data and colour scale ranges.

Plots: Single vs. Multiple Channels

If the echosounder has multiple channels, as in the examples above, each channel will be plotted as a subplot, with independent axes and limits (axis limits are set from fixed intervals, e.g. every 20 dB, to facilitate inter-plot comparison). In addition, if the device has exactly 3 channels, an additional RGB composite plot will be shown. For the RGB subplot, each channel's data is represented by a primary colour, and the colours are combined to form an image. In this way, users can see composite details: various targets will appear as different colours, depending on their relative target strength as a function of frequency. This is useful for differentiating targets such as fish, zooplankton, bubbles and whales. The examples above show the RGB composite (not very interesting there, but it is very interesting with real data).
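The idea behind the RGB composite can be sketched in a few lines of Python: each channel's value at a pixel is scaled to 0..255 and used as one colour component. The channel-to-colour assignment, the scaling limits and the clipping used here are assumptions for illustration only, and may differ from how the actual plots are rendered.

```python
def scale_to_byte(value, low, high):
    """Map value from [low, high] onto 0..255, clipping outside the range."""
    if high <= low:
        raise ValueError("high must be greater than low")
    fraction = (value - low) / (high - low)
    return round(255 * min(1.0, max(0.0, fraction)))


def rgb_pixel(levels_db, low_db, high_db):
    """Combine one dB value per channel into an (R, G, B) tuple."""
    if len(levels_db) != 3:
        raise ValueError("an RGB composite needs exactly three channels")
    return tuple(scale_to_byte(level, low_db, high_db) for level in levels_db)


# Hypothetical dB values for the same pixel on three channels.
print(rgb_pixel((-60.0, -70.0, -100.0), low_db=-100.0, high_db=-60.0))  # (255, 191, 0)
```

A target that is strong on the first channel and weak on the third shows up as an orange pixel in this sketch, while a target that is equally strong on all three channels comes out grey or white.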

Using the MAT file

Unlike the DT4 files, MAT files cannot usually be visualized directly by commercial software. Instead, these files are intended for analysis using MATLAB. In the data structure for each channel, the vals field, which contains the raw A/D counts, can be easily converted into target strength or volume backscatter strength (via a log transform), depending on what information is required. The target strength is the strength of reverberation from single, large targets such as fish, while the volume backscatter strength is for clouds of scatterers such as plankton. Please consult a good textbook on underwater acoustics for more information, for example: H. Medwin & C. S. Clay, Fundamentals of Acoustical Oceanography (Academic, Boston, 1998). Since the amount of data is massive, it would not be practical to include calculated values for all three types of data. However, any MATLAB user can quickly calculate the target or volume backscatter strength from the raw A/D counts, or they can select the calibrated data product option as detailed above. Here is some example MATLAB code that calculates and plots the data for channel 2 from a non-calibrated, non-resampled MAT file:

% Load a BioSonics MAT file first: drag and drop it into MATLAB, or uncomment:
% uiload
chn = 2;
vals = double(data(1,chn).vals);
 
% plot the A/D vals
xStartDay = floor(min(data(1,chn).time));
x = (data(1,chn).time - floor(min(data(1,chn).time)))*24*60;
y = fliplr(data(1,chn).range.');
figure;
imagesc(x,y,flipud(vals))
set(gca, 'ydir', 'normal')
ylabel('Range (m)')
xlabel(['Minutes since ' datestr(xStartDay,'dd-mmm-yyyy') ' 00:00:00'])
title('A/D values')
 
% Calculate the receive pressure - just for fun
p = vals/10^(data(1,chn).snd.rxee.rs/200);
 
% Calculate and plot the volume backscatter strength
a = data(1,chn).env.absorb; %calculated from environmental conditions in dB/m for each channel/frequency
psi = data(1,chn).snd.rxee.bwy/20*data(1,chn).snd.rxee.bwx/20*(10^-3.16);
Sv = 20*log10(vals) -...
    (data(1,chn).snd.rxee.sl + data(1,chn).snd.rxee.rs + data(1,chn).env.power)/10 +...
    repmat(20*log10(data(1,chn).range) + 2*a*data(1,chn).range, 1, size(vals,2)) -...
    10*log10(data(1,chn).env.sv*data(1,chn).snd.pulselen/1000*psi/2) +...
    data(1,chn).snd.ccor/100;
 
figure;
imagesc(x,y,flipud(Sv))
set(gca, 'ydir', 'normal')
ylabel('Range (m)')
xlabel(['Minutes since ' datestr(xStartDay,'dd-mmm-yyyy') ' 00:00:00'])
title('Volume Backscatter Strength')
 
% Calculate and plot the target strength
TS = 20*log10(vals) -...
    (data(1,chn).snd.rxee.sl + data(1,chn).snd.rxee.rs + data(1,chn).env.power)/10 +...
    repmat(40*log10(data(1,chn).range) + 2*a*data(1,chn).range, 1, size(vals,2)) +...
    data(1,chn).snd.ccor/100;
 
figure;
imagesc(x,y,flipud(TS))
set(gca, 'ydir', 'normal')
ylabel('Range (m)')
xlabel(['Minutes since ' datestr(xStartDay,'dd-mmm-yyyy') ' 00:00:00'])
title('Target Strength')
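For readers working outside MATLAB, the per-sample arithmetic of the listing above can be restated as scalar Python functions. This is a term-by-term transcription of that listing, not an independent calibration reference. The argument names mirror the MAT file fields; reading env.sv as the sound speed is an assumption, and the numeric inputs are hypothetical values chosen only to exercise the functions.

```python
import math


def common_terms(counts, sl, rs, power, ccor, absorb, range_m):
    """Terms shared by Sv and TS: counts, source/receive levels, absorption."""
    return (20 * math.log10(counts)
            - (sl + rs + power) / 10
            + 2 * absorb * range_m
            + ccor / 100)


def target_strength(counts, sl, rs, power, ccor, absorb, range_m):
    """TS for one sample, using 40*log10(range) spreading."""
    return (common_terms(counts, sl, rs, power, ccor, absorb, range_m)
            + 40 * math.log10(range_m))


def volume_backscatter(counts, sl, rs, power, ccor, absorb, range_m,
                       sound_speed, pulselen, bwx, bwy):
    """Sv for one sample, using 20*log10(range) and the sampled-volume term."""
    psi = (bwy / 20) * (bwx / 20) * 10 ** -3.16
    volume_term = 10 * math.log10(sound_speed * pulselen / 1000 * psi / 2)
    return (common_terms(counts, sl, rs, power, ccor, absorb, range_m)
            + 20 * math.log10(range_m)
            - volume_term)


# Hypothetical inputs, for illustration only (not real calibration values).
params = dict(counts=1000.0, sl=2200.0, rs=-500.0, power=0.0, ccor=0.0,
              absorb=0.01, range_m=10.0)
ts = target_strength(**params)
sv = volume_backscatter(**params, sound_speed=1500.0, pulselen=0.4,
                        bwx=70.0, bwy=70.0)
print(f"TS = {ts:.2f} dB, Sv = {sv:.2f} dB")
```

As in the MATLAB listing, a sample with zero counts has no finite logarithm, so such samples need to be masked or skipped before calling these functions.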