...

Info

We have recently switched our primary storage format for audio data from compressed wav files to FLAC files. FLAC is more space- and CPU-efficient, and is widely supported. As of April 9, 2021, all hydrophones produce FLAC files as source. Before that, there was a transition period from August 10, 2020 to April 9, 2021, during which FLAC files were generated from the source wav files and both formats were archived. Archived file formats are readily available, while other formats are generated on-the-fly and are much slower to access. For fast access to lossless audio data, select wav for data up to August 10, 2020, and FLAC thereafter. The data availability graph in Data Search may be updated in the future to show which files are available directly, and the archivefiles service can also provide a list of files in the archive.
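The date ranges above can be turned into a small lookup when building searches programmatically. The following is a minimal sketch, not part of any ONC tool: the function name `archived_formats` and the constant names are hypothetical, and the handling of the two boundary days (August 10, 2020 counted as the first day of the transition, April 9, 2021 counted as FLAC-only) is an assumption based on the wording above.

```python
from datetime import date
from typing import List

# Dates taken from the paragraph above; boundary handling is an assumption.
TRANSITION_START = date(2020, 8, 10)  # FLAC begins to be archived alongside wav
FLAC_ONLY_START = date(2021, 4, 9)    # hydrophones produce FLAC as source


def archived_formats(recording_day: date) -> List[str]:
    """Return the lossless formats expected to be in the archive for a day.

    Hypothetical helper: it only encodes the date ranges described in the
    text and does not query the archive itself.
    """
    if recording_day < TRANSITION_START:
        return ["wav"]
    if recording_day < FLAC_ONLY_START:
        # Transition period: both formats were archived.
        return ["wav", "flac"]
    return ["flac"]
```

For example, `archived_formats(date(2020, 12, 1))` falls in the transition period and returns both formats, so either can be requested without on-the-fly conversion.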

If users create large requests that generate FLAC/MP3/WAV formats on-the-fly, the temporary space that holds these data products may fill up, in which case their search requests will be cancelled and the data deleted. Searches with very large search bounds may also be cancelled if they would take too long to complete without being interrupted by semi-monthly software maintenance. Please use the date ranges above to create searches that retrieve data only from the archive and avoid generating data products on-the-fly. While the temporary holding space is quite large, it is easy to request over 10 terabytes of data in a single search. We're working on a more permanent fix for this issue. Please accept our apologies if your search requests are affected. Searches will be cancelled only if necessary, and the user will be contacted via email.

Another good workaround is to request data in smaller "chunks" of time. This can also be done programmatically using the dataProductDelivery Service (Oceans 3.0 API), so that the data can be downloaded and processed as it arrives. Another alternative is to run processing code or software on the ONC cloud, which avoids downloading the data entirely and removes the limitations of local computing resources. Contact us if you're interested.
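As an illustration of the chunking idea, the sketch below splits a long time range into consecutive smaller windows. It is a hedged example only: the function name `time_chunks` is hypothetical, the three-day step is arbitrary, and the actual request to the dataProductDelivery service is deliberately left out because its parameters are not described here.

```python
from datetime import datetime, timedelta


def time_chunks(start, end, step):
    """Yield consecutive (chunk_start, chunk_end) pairs covering start..end.

    Hypothetical helper: each pair could be used as the bounds of one
    smaller search, submitted and processed before moving to the next.
    """
    if step <= timedelta(0):
        raise ValueError("step must be a positive duration")
    cursor = start
    while cursor < end:
        # The last chunk is shortened so it never runs past the end.
        chunk_end = min(cursor + step, end)
        yield cursor, chunk_end
        cursor = chunk_end


# Example: one week split into windows of at most three days.
windows = list(
    time_chunks(datetime(2021, 5, 1), datetime(2021, 5, 8), timedelta(days=3))
)
for chunk_start, chunk_end in windows:
    print(chunk_start.isoformat(), "to", chunk_end.isoformat())
```

Processing one window at a time keeps each request small, so less temporary space is needed at once and a cancelled search costs only that window.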


Info

On-the-fly conversion between audio formats is disabled by default. Searches may return "no data found" for some formats even though data exists in other audio formats. See "Audio Format Conversion" below.

Given the sensitive nature of hydrophone data, the military has the ability to divert the audio data as required. Diverted data is then reviewed by military authorities and, if it does not contain sensitive recordings, returned to the ONC archive. This process usually blocks out a few days of near-live data, and the review and return then take up to 3 months (usually about a month). Diversion occurs securely at the source, and ONC has no access to the data until it is reviewed and returned. The data products, Data Preview and data availability are all updated automatically when the data is returned. Because hydrophones generate multiple types of data, the data availability plot in Data Search may show available data when audio data is not available. This is often indicated by the tones of the colour in the data availability plot:

...