Sessions 1 and 2: DEM Specifications and Data Management

Need to create a table and determine the following items:

  • extent - data needs to extend beyond the edge of where you want to go.
  • projection/coordinate system
  • format
  • cell registration (the reference point on each cell, centre or corner). ArcMap uses the centre point by default, and ArcGIS will automatically apply pixel registration; merging DEMs needs to take this into account. The Clip tool has options to align to the pixels so that there is no minor offset. Most DEM data will need to be converted to grid registration (check the header information of the .asc file). Use gdalinfo to view input file information (see the example after this list). Keeping a consistent pixel size across inputs also helps
  • units of measure for all data
  • Who else can use the product and what are their needs (builds support for the product)?
  • archival plan with ability to rerun the process
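For example, a quick registration and header check can be done with gdalinfo; the filename below is a placeholder:

> gdalinfo some_input_dem.asc
> head -6 some_input_dem.asc   ## the xllcorner/xllcenter keywords in an Arc ASCII header show the registration ##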

Steps

  1. Open ArcCatalog and ArcMap
  2. Save ArcMap as new MXD document
  3. Set map coordinate system of data frame to EPSG 4326 (WGS 1984)
  4. Add a new bounding box polygon to the map
  5. Add an imagery basemap to get an idea of the area
  6. Make the bounding box the only selectable layer
  7. Start editing features and draw a bounding box around Port Alberni
  8. Reset the boundaries to -124.9 to -124.78 (longitude) and 48.95 to 49.28 (latitude)
  9. Add a buffer polygon that has a buffer of 0.05 degrees
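As a cross-check on steps 8 and 9, the buffered extent can also be computed on the Linux side. This is only a sketch using bc, expanding the rectangle from step 8 by the 0.05 degree buffer, which approximates the buffered polygon's extent:

W=-124.9; E=-124.78; S=48.95; N=49.28; BUF=0.05
## print the expanded box in the west/east/south/north form used later for GMT -R ##
echo "-R$(echo "$W - $BUF" | bc)/$(echo "$E + $BUF" | bc)/$(echo "$S - $BUF" | bc)/$(echo "$N + $BUF" | bc)"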

Sessions 3 and 4

Consult the file structure outline in the link provided

  • Naming of files should retain the source, e.g. CHS_Hydro data transformed using mean high water should be CHS_trans_mhw. Make the file naming structure easy to understand
  • A PDF of a webpage should be included as well as the link itself, since links can change
  • Finding data can be a challenge. Try a Google search as an initial step to see what comes up, then link those sources
  • Save all raw data in the archived files so that the model can be reprocessed, if necessary
  • Manage file sizes as you go along by compressing data and keeping only what you use
  • Set up a kind of "breadcrumb trail" of files along the processing steps so that individual steps can be changed later
  • Keep a spreadsheet for each data source and maintain all information: filenames, vertical datum, horizontal datum, resolution, units, and contact information.
  • Look at the area identified by the modelers. Ask questions like what features are they interested in, where is the tide gauge, and are there any schools in the area? If there are evacuation routes, where are they?  
  1. Set up a file directory where the structure is as follows: ProjectName as area, e.g. "PortAlberni_v1", then "data", "software" and "documentation". Under data, create subdirectories by source: "ONC_Multibeam", "CHS_Multibeam", etc.
  2. Create the same structure in Linux using some common commands: ls = list files, mkdir = make directory, and cd = change directory (see the sketch below)
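A minimal sketch of building that structure from the Linux command line; the subdirectory names follow the outline above:

mkdir -p PortAlberni_v1/data PortAlberni_v1/software PortAlberni_v1/documentation
mkdir -p PortAlberni_v1/data/ONC_Multibeam PortAlberni_v1/data/CHS_Multibeam
cd PortAlberni_v1
ls    ## confirm data, software and documentation were created ##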

Working with unclassified LIDAR data

Open Global Mapper

Select your own data files

Clear all classifiers, then select the first one, "Created, never classified", and click OK

Open the _g files and select the classifier "ground". See the difference: these data have been cleaned using LP360 software

Think about resolution when editing files, as the effort may not be worth the detail

 

Viewing chart data

  1. Open the WMS server to see the charts:
  2. In ArcMap, click Add Data
  3. From the dropdown, select GIS Servers
  4. Select the WMS server and follow the prompts
  5. Enter the link: http://geoportal-geoportail.gc.ca/arcgis/services/Hydrographic_Charts_ENG/MapServer/WmsServer?

 

Viewing polygon data

  1. Open all polygons in the coverage area in ArcMap
  2. Overlay them to check coverage and where other data may need to be sourced
  3. Exports of this overlay may help to motivate access to missing areas or new surveys

Creating Bounds

  1. Switch to Linux
  2. Change directory to PortAlberni/data/bathy/ascii_surveys
  3. Type "bounds" to get help for the program. This program is written in C and creates a boundary polygon for each dataset
  4. A convex hull is usually good enough for the type (-v), but concave can be used where there are many turns

Input > bounds -v 0.01 2011AlberniInlet_5m_Final_5028081_wgs.xyz  > 2011AlberniInlet_5m_Final_5028081_wgs.gmt

5. Or set it up as a for loop:

for i in *.xyz; do bounds -v 0.01 $i > $(basename $i .xyz).gmt; done

A few other useful tools:

Output file contents

> cat filename

Convert vector data

> ogr2ogr --formats

Convert raster data

Use GDAL (e.g. gdal_translate); a couple of sketch conversions follow
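Two sketch conversions, reusing filenames from elsewhere in these notes (adjust to your own files; the output names are placeholders):

## vector: convert a GMT boundary polygon to a shapefile ##
> ogr2ogr -f "ESRI Shapefile" 2011AlberniInlet_bounds.shp 2011AlberniInlet_5m_Final_5028081_wgs.gmt

## raster: convert an Arc ASCII grid to GeoTIFF ##
> gdal_translate -of GTiff 2011AlberniInlet_5m_Final_5028081.asc 2011AlberniInlet_5m_Final_5028081.tif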

Creating a Coastline

We are using the nautical chart coastline that CHS has already made and modifying it for the needs of the coastal areas.

The tiles from the charts can be found as discussed the day before; we will need to reload the WMS, which should bring them in. Building a coastline is quite demanding and may sometimes take as long as a week. We will focus on the harbour of Port Alberni.

The chart is CFI70168; the web services are down right now.

We can use ArcCatalog to export the chart as a shapefile that we can use in ArcMap. Open ArcCatalog and, in the directory tree, go to the Session 6 data files with coastlines; the chart will be in there. If you don't see it, go to Customize, and under Extensions enable all the extensions. Use the best imagery you have; for now we will use this one as a start.

We can zoom in and compare the coastline to the features there. Part of the discussion with the modelers is about which features they want in the model. Jetties and breakwaters are kept, but piers may be removed or kept depending on their size.

Make the coastline the only selectable layer. Areas need to be connected for the line, avoiding gaps in the coastline. This is one continuous line for the DEM and the DEM buffer box. We will make a polygon out of this, which can be compared with imagery and bathy-topo or topo. We can use a colour palette that changes at the zero line, which makes this easier. The coastline gives no depth information; it is being used mostly as a mask for data points such as lidar returns over the water or spurious points along the line. We can use it as a source data file if we have no bathy-topo lidar.

To connect the points, double-click on the line and use the Continue Feature tool from the editing toolbar. There are high water and low water lines; in BC these can be used for Lowest Low Water Large Tide (LLWLT) and Highest High Water Large Tide (HHWLT). We can use these, but the high water line doesn't get you to zero. Street View may help differentiate structures if you can see under the bottom of the structure, to tell whether there is a gap. The S-57 coding distinguishes between breakwater, pier, etc. On this coast most of them have water underneath. The Army Corps of Engineers keeps AutoCAD files of structures, which may also help.

In areas with sparse bathymetry the coastline becomes more important because it guides the surface there. A lot of this work is based on knowledge of the area, so either we visit, have a person in the area, or check images of the area to build some of the structures by hand if they are not shown. There is also a lot of debate about including buildings or trees, although tsunami modellers normally prefer to eliminate buildings so that we get the worst-case scenario. Culverts are also something to check. Areas in the salt flats in San Francisco are separated by levees which water may or may not cross; we need to check on this for the DEM.

Changes over time on the coastline also affect the DEM and these changes may not be captured in all the data that we have available. There may be bathy in an area where sedimentation has covered the coast or vice versa.

Part II of Coastlines

Now that we have a coastline (we imported the BC coast), we use the Clip tool (under Geoprocessing) to clip the coastline to the buffered bounding box that we created at the beginning of the previous day.

Part III of Coastlines

Under the editing tools, the Densify tool will add vertices around the polygons so we can match our cell size; a good value for this is around the cell size, e.g. 2 m (roughly 2×10^-5 decimal degrees), and the default is OK. Then go to Data Management Tools > Features > Feature Vertices To Points. This takes every vertex in the polygon and converts it to a point file (point location ALL).

You need to add a field for Z when saving, because so far the file only has X and Y. A Z value of -0.01 sometimes gives a better transition than setting it to 0. All values are set the same; this is like stapling the grid to a single value along the coastline. We can use the Add Field tool from the toolbox to add this Z.

From CHS data we could use the MinZ in the coastline file to force the Z values to those values instead of -0.01. On the Pacific coast the tidal range is 8 m, so tying this to zero would make a huge difference. In this case we use 0 because this dataset has a low priority and is only an approximation to cover the lack of data.

Export the attribute table to a text file to get an XYZ file of coastline points.

Part IV of Coastline

We are now moving to Linux and transferring that XYZ file. We want to cd to the data folder.

We will use a program called "awk". 

We will type: awk '{print $1,$2,0}' pa_coast_pts3.xyz > pa_coast_pts4.xyz

This makes sure that everything is space delimited and formatted correctly, with the z value that we want. awk goes line by line and prints the 1st and 2nd values, and the 3rd becomes a 0; so we change the -0.01 to 0. This program can also do calculations (see the sketch below). This could be done in ArcGIS too.
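As a sketch of the calculation side, awk could also apply a unit conversion on the fly; the filename here is hypothetical (depths in feet):

> awk '{print $1,$2,$3*0.3048}' depths_feet.xyz > depths_metres.xyz   ## convert the z column from feet to metres ##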

 

Data processing

For cleaning LIDAR data we can use two different methods. In LIDAR data the water surface is less dense and about 1 m lower, so using those assumptions we can filter out the water-surface points.

For Session 7 there are two files, .las and .tif; we can look at the .tif in Windows. We are going to remove various water-surface points in the Alberni Inlet using a program called lasfilter that is available on the Linux side. It is open source. We could also use the coast polygon to clip out the water points, but here we will use the filters.

This is an example of how it is easier to write several commands into a script and run them all together.

Go to: home/onc10dem/Projects/PortAlberni/data/DEM_workshop_data_ml/ses7_dataproc/pa_data/raw/topo

The .las classification changes how points are labelled but does not overwrite the point data. It helps to classify, and then you can move points between classes. In most cases the data will already be classified, with the ground classified. A lot of the time the water surface is not classified anyway.

We are going to use lasfilter. 

  1. We start by reclassifying all water points to zero by typing: lasfilter Ground_2M_Grid_092FO16.las -f0:0
  2. Next we reclassify all the points below 0.5 m to classification 9; those are points on the water surface: lasfilter Ground_2M_Grid_092FO16.las -f 3:0.5:9
  3. Now we reclassify, based on point density, all the points that have a density less than 10 pts per 10 sq meters to class 12: lasfilter Ground_2M_Grid_092FO16.las -f 5:10:0:10:12
  4. Reclassify class 0 to class 2 (ground): lasfilter Ground_2M_Grid_092FO16.las -f 0:2 -c 0
  5. Reclassify all ground points (2) that have a density less than 10 pts per 10 sq meters to class 12: lasfilter Ground_2M_Grid_092FO16.las -f 5:10:0:10:12 -c 2
  6. Reclassify all ground points lower than 1 m to class 8: lasfilter Ground_2M_Grid_092FO16.las -f 3:0:1:8 -c 2
  7. Reclassify all ground points that have a density less than 25 pts per 20 sq meters to class 12: lasfilter Ground_2M_Grid_092FO16.las -f 5:20:0:25:12 -c 2
  8. Reclassify all points higher than 5 m from the class used to hold sparse points: lasfilter Ground_2M_Grid_092FO16.las -f 3:1:5:2 -c 12

We could also use the coast polygon, which is more similar to a clipping operation: lasfilter -p bc_coastpoly.shp -f 0:11 Ground_2M_Grid_092F016.las
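Since it is easier to run these as one script (as noted above), here is a minimal sketch bundling steps 1 to 8; it assumes lasfilter behaves exactly as described in those steps and is run from the directory containing the tile:

#!/bin/sh
## run the lasfilter sequence from steps 1-8 on one tile ##
las=Ground_2M_Grid_092FO16.las
lasfilter $las -f 0:0                # step 1: reclassify all water points to class 0
lasfilter $las -f 3:0.5:9            # step 2: points below 0.5 m to class 9 (water surface)
lasfilter $las -f 5:10:0:10:12       # step 3: density < 10 pts per 10 sq m to class 12
lasfilter $las -f 0:2 -c 0           # step 4: class 0 to class 2 (ground)
lasfilter $las -f 5:10:0:10:12 -c 2  # step 5: ground with density < 10 pts per 10 sq m to class 12
lasfilter $las -f 3:0:1:8 -c 2       # step 6: ground points lower than 1 m to class 8
lasfilter $las -f 5:20:0:25:12 -c 2  # step 7: ground with density < 25 pts per 20 sq m to class 12
lasfilter $las -f 3:1:5:2 -c 12      # step 8: points higher than 5 m moved out of the sparse-point class
echo "filtering done"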

There is normally a huge variety of data that goes into a combined DEM, and you will need to identify what changes are needed in each dataset, and what processing and software bring it to a level that is good for a DEM. The more data files that go into it, the more work that needs to be done. Data types are really variable depending on the source of the data.

For example, when they get canal profiles they need to densify them first so they do not show up as ripples. In some cases there are a lot of options and a lot of good data; if there is not much data we will be limited. Some places won't release raw data, but we can always ask.

After filtering we need to convert the file to xyz by running: las2xy Ground_2M_Grid_092FO16.las

There is not much documentation, but Matt is sending this info around and it will also be posted on our website.

LAStools is another toolbox, but if they find you using it there is a Hall of Shame... and you can buy your way off the list!

We zip it and then move it to the Windows side, and look at the file there. The unzipping and loading in ArcGIS takes quite a bit of time. We can transform the file so that it is easier to load.

Move the file to another folder: cp 2011AlberniInlet_5m_Final_5028081.asc ~/Projects/PortAlberni/data/bathymetry/

The organized file structure helps to get the files in the right place and move around.

The files we got are in UTM. We could reproject to WGS84 if we wanted.

We use GDAL to transform to XYZ. GDAL translates raster formats and is open source; ArcGIS and FME use this exact library for raster formats, so you are getting the same thing as the open-source tool.

The command gdalinfo gives a lot of the information in an easier-to-read way than looking at the header of the file.

To change the format we can use: gdal_translate 2011AlberniInlet_5m_Final_5028081.asc 2011AlberniInlet_5m_Final_5028081.xyz -of XYZ

This handles conversions between most raster formats.

We are going to remove some of the data from 2011AlberniInlet_5m_Final_5028081.xyz with a program called grep, which will return the data file without the no-data values that take a lot of space and make the file harder to load.

The command is: cat 2011AlberniInlet_5m_Final_5028081.xyz | grep -v "3.402823...e+38" | cs2cs -f "%.6f" +proj=utm +zone=10 +ellps=GRS80 +to +proj=longlat +datum=WGS84 +ellps=WGS84 > 2011AlberniInlet_5m_Final_5028081_wgs84.xyz

[note when files are already in the correct coordinate system: grep -v "\-9999" filename > new filename also seems to work]

The %.6f specifies the precision of the output. You don't normally need to go higher than that; six decimal places are normally enough.

The previous command removes no-data values and changes coordinates, and we could also include a single-value datum conversion if we had one.

Note that 3.402823...e+38 is the no-data value in the header of the Alberni Inlet file. To check what the no-data value is for a given file, just check it with gdalinfo.
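For example, a quick way to pull just that value out (using the Alberni file from above):

> gdalinfo 2011AlberniInlet_5m_Final_5028081.asc | grep -i nodata   ## prints the NoData line from the header report ##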

A useful website is spatialreference.org, and also the Proj.4 website; Proj.4 has several Python and Java libraries particularly targeting geodesy.

 

Session 8: Datums

We figured out what datum the data are in; all the data will need to be resolved to a particular final datum, both vertically and horizontally. The VDatum tool covers most of the coastal data in the States (not Alaska or Hawaii). You download and run the app on your computer and can process files individually or in a batch, and it transforms all the data to the same datum. Unfortunately it does not convert tidal vertical datums significantly up rivers or channels; the transformation grid does not work that far up. Where VDatum is not available, tide gauge info needs to be extrapolated: several points with tidal info are used to create a grid, and then that grid is used. It is not as accurate as VDatum, but depending on the resolution of the grid this may or may not matter. Some ways to convert are better than others, and it matters more at high resolution or on the coastline than offshore.

For areas in Alaska that are not covered, in some cases we only have a single value; in that case you need to apply a single value to the entire survey or group of surveys. This will be less accurate, but better than nothing. The west coast of BC will soon have continuous values. The Arctic and Atlantic will take longer. The Quebec region is done.

The US and Canada have different datums, and this makes it quite an issue. There are a lot of problems trying to resolve datums in some areas, while in other areas it is easy. Right now it will be in a .csar surface, and it will be a living dataset with uncertainty values, because some areas are not good and it will just be a reference for bathy in Canada. It is not clear if this will be done for the topographic initiative. There are some initiatives in Vancouver to do this too, but nothing is clear yet. NRCan may be using geodetic datums, but this is not certain.

VDatum allows you to convert points, vectors and rasters, across a large suite of datum surfaces. It is a great tool that solves the gap between bathy and topo. There is plenty of info on the VDatum website: http://vdatum.noaa.gov

There are plans to extend it to Alaska and Hawaii in a couple of years. It is an NOS tool. 

The Mascot database (a provincial database) contains more benchmarks

GPS-H tool (NRCan) will convert between NAD83 and other vertical datums such as the new CGVD2013

Session 9 - Bathymetric Surface

All areas of the bathymetric surface need to have a value, whether measured or interpolated. This will be done on the Linux side using GMT. The coastline acts as a mask to define the boundary between bathymetry and topography. The surface should reveal obvious errors in the different datasets, such as incorrect units.

Go to gmt.soest.hawaii.edu for more information. Under the documentation for the software you will find detailed information about each of the commands. GMT 5 is a recent release.

In Linux, navigate to the surface directory under the ses9 directory

Open a text editor. This can be done in Linux with the command below or with any text editor: 

> nano pa_surface.sh

The following script will open


#!/bin/sh
#
# set parameters
# -- regional map ---

pa_range="-R-124.9/-124.78/48.95/49.28" ## this is the extent ##
size="-I.33333333c" ## this is the cell size ##
version="surface" 
name="port_alberni" ##editing this will change the output filename##

## Bathyemtry ##

## GMT 4 ##
#./data/pa_coast_pts3.xyz \
#./data/enc_all.xyz \
#./data/ascii_surveys_pa_bm.xyz | \
#blockmedian $pa_range $size -V | surface $pa_range $size -G${name}.grd -V -Lu-$ ##this helps manage the data sizes##

## GMT 5 ##

## values will be made underwater where they should be with this ##
cat \
./data/pa_coast_pts3.xyz \
./data/enc_all.xyz \
./data/ascii_surveys_pa_bm.xyz | \
gmt blockmedian $pa_range $size -V | gmt surface $pa_range $size -G${name}.grd -V -Lu-0.01 -T0
## "T" tension can be adjusted here, as needed; it ranges between 0 and 1. L forces any interpolated value to not be above 0.01. Cell size depends on your resolution, usually set one step below the final resolution; this allows smoothing of features and reduces blockiness. The decision varies with the nature of the area: for example, if there are lots of marshes and inlets vs a linear smooth entrance, we may want a finer resolution. ##

## GMT 5 ##

gmt grdreformat ${name}.grd ${name}.asc=gd:AAIGrid

gdal_nan2null.py ${name}.asc -9999 ${name}.tif

# clip to coastline

clip_gdal.sh -I ${name}.tif -O ${name}_clip.tif -M ./coastline/bc_coastpoly.shp

# make mhw grid

#grdmath -N ${name}.grd ../conversion_grid/NAVD88_1s.grd SUB = ${name}_mhw.grd

#mbm_grd2arc -I${name}_mhw.grd -O${name}_mhw.asc -N-9999 -V

#sed -i 's/corner/center/g' ${name}_mhw.asc

#gdal_translate -of HFA ${name}_mhw.asc ${name}_mhw.img

#gdaladdo -ro --config USE_RRD YES ${name}_mhw.img 2 4 8 16 32

#gdalinfo ${name}_mhw.img -stats -hist

# Make hillshades

hillshade.sh -I ${name}_clip.tif -O ${name}_clip_hs

## make xyz

gdal_translate -of XYZ ${name}_clip.tif ${name}_clip.xyz

grep -v "\-9999" ${name}_clip.xyz > ${name}_clip2.xyz

mv ${name}_clip2.xyz xyz

cd xyz

RUNxyz_list bathysurface

cd ..

echo "done"

Before running the surface, ensure that all source files are formatted correctly. Headers cannot be present; using the head command on each file will help check this. All data must be in xyz format. Projections need to be consistent: for NOAA, this means WGS84 lat/long, with MHW currently used as the vertical datum. This is under discussion and may shift to an ellipsoidal reference for the vertical.
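For instance, using one of the input files named in the script above:

> head -3 ./data/enc_all.xyz   ## should show space-delimited x y z values and no header lines ##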

To run the script, edit the output filename then enter the following:


> ./pa_surface.sh

Note that to use a renamed edited script, I had to first make it executable with chmod 755

Try editing the tension, the L value, and the presence of the coastline to fine-tune differences. Open the resulting hillshade files to help evaluate.

Checking derived values for accuracy

  1. Navigate to the data subdirectory in the ses9 folder's surface directory:
    /home/onc7dem/Projects/PortAlberni/data/DEM_workshop_data_ml/ses9_surface/surface/data

  2. Take a random 10% of the data out to test the accuracy of the final grid by running the following:

    > rsplit -p 10 -r ascii_surveys_pa_bm.xyz > sample10.xyz 2> ascii_surveys_pa_bm_samp.xyz      

    ##the 2> takes the remaining data and sends it to the other file##

  3. Once this file is created, go into pa_surface.sh and change the input filename to match "ascii_surveys_pa_bm_samp.xyz"

Run the pa_surface.sh script again with a new output filename

 

Evaluating the Z-difference:

  1. Reformat the 10% sample data file from tab delimited to space delimited:
    > awk '{print $1,$2,$3}' sample10.xyz  > sample10b.xyz
  2. Generate a z-difference file using gdal_query.py to evaluate the z result:
    > gdal_query.py -d_format "xyzgd"  port_alberni_ml_samp_clip.tif data/sample10b.xyz > sample_diff.xyd
  3. get rid of nodata values:
    >grep -v "\-9999" sample_diff.xyd > sample_diff_b.xyd
  4. Generate histogram of differences:

    >xyz_histogram.py diffs.png sample_diff_b.xyd -p_form xyzgd -column d -statistics

  5. Resultant histogram should be a bell curve centred on 0.

  6. Can also load the resultant points into ArcGIS and plot them to see where the outliers reside (i.e. in areas of high interpolation, etc.). A quick numerical summary is sketched below.
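Besides the histogram, a quick numerical summary of the differences can be pulled out with awk; a sketch assuming the difference is the fifth column of the xyzgd file:

> awk '{n++; s+=$5; ss+=$5*$5} END {printf "n=%d mean=%.3f rms=%.3f\n", n, s/n, sqrt(ss/n)}' sample_diff_b.xyd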

Generating the DEM

  1. Rerun the pa_surface.sh script at 0.333333 arcsec resolution without any sample testing data and generate a new bathy surface.
  2. Load polygons into ArcMap and define the coordinates of a bounding box around the missing topography. This will be used to reduce the coarse CDSM data and speed up processing:

    >gmt gmtselect -R-124.8/-124.776/48.943/49.202 cdsm_dem_151124_083006B.xyz > cdsm_dem_clip.xyz

    Run the following to check the bounds:

    > gmt minmax cdsm_dem_clip.xyz

  3. cd into the xyz directory and remove all test files, so that only the final xyz files remain
  4. Organize data so that processing runs faster by going to all directories with xyz files and running: 
    >RUNxyz_list related_name
    This will generate an inf file for each xyz file (MB-System process), where related_name is something unique to that directory, such as bag or ascii
  5. Navigate to ses10 directory and remove the existing pa.datalist file
  6. Generate data list for final gridding
    >find ../../ -type f | grep .datalist > pa.datalist

  7. Open pa.datalist in text editor to insert weighting

    >nano pa.datalist

    Following each datalist line, add to the line:  

    -1 [weight]

     Adjust the second value to be the weighting. The actual values do not really matter, just their relative weighting (somewhere between 1 and 100); see the example datalist lines after the pa.sh script below.

  8. Have a look at the following documentation: http://www.ldeo.columbia.edu/res/pi/MB-System/html/mbgrid.html

  9. The script pa.sh will generate the DEM:

# set parameters
# -- regional map ---

range="-R-124.9/-124.78/48.95/49.28" ##this is the extent, and must be in W/E/S/N##
version="cgvd"
name="port_alberni"
data="-Ipa.datalist"
grid="$name.grd"
#
# -----------------------generate gridded bathymetry----------------------
#   9 arcsec: -E0.00250000000/0.00250000000/degrees!
#   6 arcsec: -E0.00166666667/0.00166666667/degrees!

#   3 arcsec: -E0.00083333333/0.00083333333/degrees!
#   2 arcsec: -E0.00055555556/0.00055555556/degrees!
#   1 arcsec: -E0.00027777778/0.00027777778/degrees!
#   1/3 arcsec: -Ed0.00009259259/0.00009259259/degrees!
#   1/9 arcsec: -E.000030864/.000030864/degrees!

mbgrid $data -O${name} \
        $range \
        -A2 -G3 -N -C10/3 -F1 -T0 \
        -E0.00009259259/0.00009259259/degrees!
## T is the tension; C specifies how much interpolation happens - fill 10 cells in every direction, and the 3 tells it to fill all cells ##

## GMT 4 ##
#mbm_grd2arc -I${name}.grd -O${name}.asc -N-9999 -V
#sed -i 's/corner/center/g' ${name}.asc

## GMT 5 ##
gmt grdreformat ${name}.grd ${name}.asc=gd:AAIGrid

gdal_translate -of GTiff ${name}.asc ${name}_${version}.tif
gdaladdo -ro --config USE_RRD YES ${name}_${version}.tif 2 4 8 16 32
gdalinfo ${name}_${version}.tif -stats -hist

# Make hillshades
hillshade.sh -I ${name}_${version}.tif -O ${name}_${version}_hs

echo "done"

A program called htop helps with task monitoring. With the current server memory, two DEMs can be processed at once.

To transfer files back and forth between Linux and Windows, open PSFTP from the Putty tools

Input the IP of the Linux side:

>142.104.128.185

Link to the Windows desktop with:

>lcd c:\Users\onc7dem\Desktop

Type:

>get port_alberni_13as_cgvd.tif

The file should now be on the Windows desktop

 

QA/QC can include topographic benchmark comparisons, slope maps, and checks for any nodata values.
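For the slope-map check, a sketch using GDAL (the filename is the DEM transferred above; -s 111120 scales degrees to metres for lat/long grids):

> gdaldem slope port_alberni_13as_cgvd.tif port_alberni_13as_slope.tif -s 111120   ## unusually steep cells often flag unit or datum problems ##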

 

Documentation

  • include problems from processing and how they were resolved
  • include where data were sourced and relevant information. Use source names rather than links as links can expire
  • include weighting system
  • any vertical datum changes
  • include suggestions for new data
  • Metadata are in ISO 19115, but older records are in FGDC. Components link to a component catalogue. Oxygen is an XML text editor that generates the metadata and also validates them. XSL is used for transformation
  • ISO records are ingested into many government organisations for easy searching

Useful websites

maplove.net (contains programs created by Matt)

cpt-city (website with colour bars; see ETOPO in particular)

A software package called "Poseidon" bundles together many of these tools, such as GMT, MB-System, etc.

osgeo.org provides open source GIS data packages 

NOAA dem catalog:

http://ngdc.noaa.gov/dem/squareCellGrid/search 

Useful software

R has a new DEM toolbox

Octave is a free alternative to MATLAB

ghost sql

To generate 3D perspectives, POV-Ray (open source) is used

 
