Recently I found a new method using a bash script I put together. It is one of two scripts; the second uses ogr2ogr to batch merge the CanVec data into shapefiles.
The batch download script takes a list of NTS sheets in CSV format, creates a workspace and begins downloading the CanVec data (or Geobase DEMs). First I'll cover the method I use to format that list from a DBF to get it ready for this process.
It basically involves stripping the DBF of its header and reducing it to a single column with no special characters.
The formatting is all done in OpenOffice.org Base.
sudo apt-get install openoffice.org-base
Apparently Ubuntu comes with an open-source Java that seems to do the trick. OpenOffice.org, unlike LibreOffice, asks specifically for the Java Runtime Environment (JRE), but it still works with the open-source Java.
Once that was all installed I could then open my DBF with Base from the terminal:
It immediately asked me for the Character Set.
Choosing the default did work, but I still chose UTF-8 because it certainly supports every character in this 50k NTS list.
ooBase default Charset highlighted.
If the shapefile came with a .cpg file, that file states the character encoding used by the DBF.
When the DBF opened it contained two columns and a header for the column names:
DBF file in OpenOffice Base showing 50k NTS Sheets
I removed the second column [B] and the first row, leaving only a list of 50k NTS sheets:
To do this I right-clicked the header of column [B] (or of the row) and selected Delete Rows/Columns, not Delete Contents.
It was then time to go to File > Save As...
Look at the bottom of the Save window that comes up and change the file type to CSV using either the pull-down menu or the expandable menu (+).
After I gave it a new filename and placed it in a folder of my choice I clicked Save/OK and this window popped up:
I selected "Keep Current Format" to continue.
After I pressed the "Keep Current Format" button I was then able to adjust the CSV export settings like the 'delimiters' and character set (which I already specified at the beginning so no need to change here):
Default CSV export settings dialog.
I removed the Text and Field delimiters, because the script treats them as invalid characters and will bail when it finds them, then pressed OK.
Field and Text delimiters removed.
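For anyone who would rather skip the dialog, the same cleanup can be roughed out on the command line. This is only a sketch, not part of my script: it assumes the table was exported with the default settings (a header row, comma field delimiters, double-quote text delimiters), and the function name clean_nts_csv and the file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: reduce a default CSV export to a bare, single-column sheet list.
# Assumes a header row, comma field delimiters and double-quote text delimiters.
clean_nts_csv() {
    local in_csv="$1"
    tail -n +2 "$in_csv" |   # drop the header row
        cut -d',' -f1 |      # keep only the first column
        tr -d '"\r'          # strip text delimiters and any CR line endings
}

# Example (hypothetical file names):
#   clean_nts_csv nts_50k_export.csv > nts_50k.csv
```

The result is written to standard output, so the original export is left untouched.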
To verify that I did everything correctly I checked the contents of the CSV file with the terminal commands 'cat' and 'head':
Using Ubuntu terminal commands to check a portion of the CSV file contents.
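A quick scripted check can do the same job as eyeballing the output. The sketch below assumes that a valid sheet consists of letters and digits only (for example 103J11); the download script's own checks are not shown in this post, and the function name check_sheet_list is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report lines that contain anything other than letters and digits.
# Returns 0 when the list is clean, 1 when an offending line is found.
check_sheet_list() {
    local csv="$1"
    # grep -n prints each offending line with its line number.
    if grep -n '[^[:alnum:]]' "$csv"; then
        echo "Found lines with invalid characters in $csv" >&2
        return 1
    fi
    echo "$csv looks clean: $(wc -l < "$csv") sheets"
}

# Example (hypothetical file name):
#   head -n 5 nts_50k.csv        # peek at the first five lines
#   check_sheet_list nts_50k.csv
```

Leftover quotes, commas or Windows line endings all show up as offending lines here.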
The download script asks you for some information before it begins:
(a) Your e-mail? It is used as the password; the user name is 'anonymous'.
(b) Where is your CSV file saved? The file gets a (minimal) check for errors.
(*b) The CSV file can be a list of either 250k or 50k sheets; it doesn't matter.
(c) A workspace to save the zip files? If it doesn't exist, it will be created.
(d) What dataset? Options currently include CanVec 50k or Geobase DEMs 50/250k.
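The four prompts above could be sketched roughly as follows. This is an illustration rather than the script itself: the variable names, the function name ask_settings and the dataset keywords canvec and dem are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch of the four prompts; all names here are hypothetical.
ask_settings() {
    printf 'Your e-mail (used as the FTP password): ' >&2
    read -r email
    printf 'Path to the CSV list of NTS sheets: ' >&2
    read -r csv_file
    printf 'Workspace folder for the zip files: ' >&2
    read -r workspace
    printf 'Dataset (canvec or dem): ' >&2
    read -r dataset

    # Minimal error check: the CSV must exist and must not be empty.
    if [ ! -s "$csv_file" ]; then
        echo "Cannot read sheet list: $csv_file" >&2
        return 1
    fi

    # Create the workspace if it does not exist yet.
    mkdir -p "$workspace" || return 1

    # Accept only the dataset keywords this sketch knows about.
    case "$dataset" in
        canvec|dem) ;;
        *) echo "Unknown dataset: $dataset" >&2; return 1 ;;
    esac
}

# Example: ask_settings && echo "Downloading $dataset into $workspace"
```

Prompts go to standard error so that they stay visible even when the script's output is redirected to a log file.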
The script itself is not entirely finished: I am still adding some 'feedback' for when it finishes downloading, such as total download time and file size/count.
The script will still get me my zip files and place them nicely in a folder for me.
I save my scripts in a folder called 'script' in my HOME directory.
Be sure to set the permissions (make it executable) prior to running it or it won't execute:
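As a minimal sketch, assuming the script is saved under the hypothetical name ~/script/canvec_download.sh, setting and confirming the executable bit could look like this (the helper name make_executable is also made up):

```shell
#!/usr/bin/env bash
# Sketch: give the owner execute permission and confirm that it took effect.
make_executable() {
    chmod u+x "$1" && [ -x "$1" ]
}

# Example (hypothetical file name):
#   make_executable ~/script/canvec_download.sh && ~/script/canvec_download.sh
```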
NOTE: The script will create a .wgetrc file in your home folder to hold the username and password, overwriting yours if it exists!
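If you already have a .wgetrc you care about, a backup step avoids losing it. The sketch below is not what the script currently does; it assumes the credentials are stored with wget's ftp_user and ftp_password settings, and the function name write_wgetrc is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: write anonymous FTP credentials to ~/.wgetrc,
# keeping a backup copy of any existing file first.
write_wgetrc() {
    local email="$1"
    local rc="$HOME/.wgetrc"
    if [ -f "$rc" ]; then
        cp -p "$rc" "$rc.bak"   # preserve the user's own settings
    fi
    printf 'ftp_user = anonymous\nftp_password = %s\n' "$email" > "$rc"
    chmod 600 "$rc"             # the file holds a password, so keep it private
}

# Example: write_wgetrc you@example.com
```

Restoring the old settings afterwards is then a matter of moving ~/.wgetrc.bak back into place.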
I ought to mention that I have never made a shell script before and everything I learned is from these online resources:
- Advanced Bash-Scripting Guide (pdf and html)
- The Linux Command Line (pdf)
- The Linux Cookbook: Tips and Techniques for Everyday Use (html)
It took me under half a day to put this script together but nearly a month to cover the required material... I am thinking it is worth it as this is a task I repeatedly do and this will streamline that process.
Next is the CanVec_shp script, which batch merges the CanVec GIS data into separate shapefiles using ogr2ogr from GDAL.
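The CanVec_shp script is not shown here, but the usual ogr2ogr merge pattern looks roughly like the sketch below: the first shapefile creates the output and the rest are appended to it. The function names, the DRY_RUN switch and the file names are assumptions for illustration, and the inputs are assumed to share the same geometry type and attributes.

```shell
#!/usr/bin/env bash
# Sketch: merge several shapefiles of the same theme into one with ogr2ogr.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

merge_shapefiles() {
    local out="$1"; shift
    local layer first=1 shp
    layer=$(basename "$out" .shp)   # layer name matches the output file name
    for shp in "$@"; do
        if [ "$first" -eq 1 ]; then
            # The first input creates the output shapefile.
            run ogr2ogr -f "ESRI Shapefile" "$out" "$shp"
            first=0
        else
            # Later inputs are appended to the existing layer.
            run ogr2ogr -f "ESRI Shapefile" -update -append "$out" "$shp" -nln "$layer"
        fi
    done
}

# Example (hypothetical file names):
#   DRY_RUN=1 merge_shapefiles merged/roads.shp sheet1_roads.shp sheet2_roads.shp
```

Running it with DRY_RUN=1 first is a cheap way to see which files would be merged before anything is written.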
I ran this formatted table (as a CSV) through the script and it finished saying:
So I ran the script again, this time adding "--no-clobber" to the wget portion so that existing files would NOT be overwritten.
Any file that already existed in my workspace would be skipped:
File `canvec_103j11_shp.zip' already there; not retrieving.
--2011-07-16 09:28:26-- ftp://ftp2.cits.rncan.gc.ca/pub/canvec/50k_shp/103/j/canvec_103j12_shp.zip
Resolving ftp2.cits.rncan.gc.ca... 22.214.171.124
Connecting to ftp2.cits.rncan.gc.ca|126.96.36.199|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/canvec/50k_shp/103/j ... done.
==> SIZE canvec_103j12_shp.zip ... done.
==> PASV ... done. ==> RETR canvec_103j12_shp.zip ...
No such file
File `canvec_103j15_shp.zip' already there; not retrieving.
File `canvec_103j16_shp.zip' already there; not retrieving.
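The wget portion with "--no-clobber" can be sketched as below. The URL layout is inferred only from the single log entry above (three-character block, then letter, then the full sheet name), so treat it as an assumption; the function names are hypothetical and the sketch covers 50k CanVec sheets only.

```shell
#!/usr/bin/env bash
# Sketch: build a CanVec 50k download URL from an NTS sheet such as 103j12.
# The URL layout is inferred from the log output, not from any documentation.
canvec_url() {
    local sheet block letter
    sheet=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    block=$(printf '%s' "$sheet" | cut -c1-3)    # e.g. 103
    letter=$(printf '%s' "$sheet" | cut -c4)     # e.g. j
    printf 'ftp://ftp2.cits.rncan.gc.ca/pub/canvec/50k_shp/%s/%s/canvec_%s_shp.zip\n' \
        "$block" "$letter" "$sheet"
}

# Download every sheet in the list, skipping zip files already in the workspace.
download_sheets() {
    local csv="$1" workspace="$2" sheet
    while IFS= read -r sheet; do
        [ -n "$sheet" ] || continue   # ignore blank lines
        wget --no-clobber --directory-prefix="$workspace" "$(canvec_url "$sheet")"
    done < "$csv"
}

# Example (hypothetical paths):
#   download_sheets nts_50k.csv ~/canvec_workspace
```

Because of "--no-clobber", re-running the loop only fetches the sheets that are still missing from the workspace.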
I headed over to the CanVec 8th Edition Datasets list (in txt format) and did a 'find' (Ctrl+F) for that specific sheet; nothing was returned - sure enough it doesn't exist!
Missing NTS sheet for CanVec download is highlighted in Yellow.
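The same lookup can be done for the whole CSV at once instead of one Ctrl+F at a time. This sketch assumes the datasets list has been saved locally as a text file and that a sheet counts as available if its name appears anywhere in that file; the function name missing_sheets and the file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the sheets from the CSV that do not appear in the datasets list.
missing_sheets() {
    local csv="$1" datasets_list="$2" sheet
    while IFS= read -r sheet; do
        [ -n "$sheet" ] || continue   # ignore blank lines
        # -i: ignore case, -F: treat the sheet name as a fixed string
        grep -qiF -- "$sheet" "$datasets_list" || echo "$sheet"
    done < "$csv"
}

# Example (hypothetical file names):
#   missing_sheets nts_50k.csv canvec_datasets_list.txt
```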