Required for this procedure:
- Notices from Proquest that files are available.
- Filezilla.
- Proquest FTP login info.
- ETD Directory on hard drive with pdf and xml subdirectories.
- 7-Zip.
- Adobe Acrobat Standard. Modify the Acrobat settings: in Acrobat, go to Edit, then Preferences. Click on "Documents" in the left-hand column. In the main part of the pop-up, under PDF/A View Mode, use the drop-down to select "Never."
- Computer configured to open XML files with WordPad (right click an XML file and select "Open with" and then "Choose Program." Select WordPad, then click "Always use the selected program to open this kind of file.").
- Editix XML Editor.
- XSL file for reformatting the XML files, ETDConversionForDspace.xsl (attached here).
- Microsoft Excel with the Developer tab and macros enabled (left click on the Windows symbol and select "Excel Options." On the Popular tab, check "Show Developer tab in the Ribbon." Go to the Trust Center tab. Click "Trust Center Settings." Click "Enable all macros.").
- Excel Template, ETDtempDspace.xlsm, (attached here).
- SAF Builder program (downloaded from GitHub and installed by LITS), along with the Java JDK, Git, and Maven. Oracle VM VirtualBox for running it on Linux, and a directory that can be accessed from both Linux and Windows.
- Collection File program (attached here in a zip file; unzip it and put it in your ETD directory) and Python to run it.
- For converting video files to MP4s: Avidemux.
...
For other problems with the files that Proquest FTPs to us, ask Michelle to contact Proquest technical support to find a solution: call 877-408-5027 or 800-889-3358, email tsupport@proquest.com, or go to http://support.proquest.com/.
Adding Supplements to the metadata in Excel and Moving them to the PDF Directory
...
- Delete all of the rows where extra data was filled in.
- Change the labels in row 1 as follows:
- Column B: filename__permissions:-r'Anonymous'__primary:true
- Column C: filename__permissions:-r'ScholarWorksUMBCIP'__primary:true
- Column D: filename__bundle:LICENSE__permissions:-r'Anonymous'
- Sort for all items without a license. Move their filenames to the UMBCIP column.
- For open access, add an Access Rights field that states: "Distribution Rights granted to UMBC by the author."
- For closed access, add an Access Rights field that states: "Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission."
- Delete column A.
Make sure all rights fields have the header dcterms.accessRights.
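The column and rights rules above can be sketched in Python as a cross-check on the spreadsheet. This is only an illustration, not part of the Excel template: the function name build_access_fields is hypothetical, and it assumes that an item with a license file is open access and an item without one is closed access.

```python
# Column headers and statements copied from the steps above.
ANON_COL = "filename__permissions:-r'Anonymous'__primary:true"
UMBCIP_COL = "filename__permissions:-r'ScholarWorksUMBCIP'__primary:true"
LICENSE_COL = "filename__bundle:LICENSE__permissions:-r'Anonymous'"
RIGHTS_COL = "dcterms.accessRights"

OPEN_ACCESS = "Distribution Rights granted to UMBC by the author."
CLOSED_ACCESS = (
    "Access limited to the UMBC community. Item may possibly be obtained via "
    "Interlibrary Loan through a local library, pending author/copyright "
    "holder's permission."
)


def build_access_fields(pdf_name, license_name=""):
    """Return the file and rights columns for one item (hypothetical helper).

    Assumption: a non-empty license_name means the item is open access.
    """
    has_license = bool(license_name)
    return {
        ANON_COL: pdf_name if has_license else "",
        UMBCIP_COL: "" if has_license else pdf_name,
        LICENSE_COL: license_name,
        RIGHTS_COL: OPEN_ACCESS if has_license else CLOSED_ACCESS,
    }


if __name__ == "__main__":
    # Made-up file names, for illustration only.
    for column, value in build_access_fields("Smith_thesis.pdf").items():
        print(column, "=>", value)
```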
Save Sheet2 (it must be the active sheet) as a .csv file. While on the "Save As" screen, change the character encoding to UTF-8 by using the Tools drop-down, selecting Web Options, then Encoding, and UTF-8.
- In the CSV file, make sure that dates are in the YYYY-MM-DD format. Find and replace to fix if necessary.
- Be sure to close Excel or the next steps won't work.
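If checking the dates by eye is tedious, a short Python script can flag the ones that are not in YYYY-MM-DD format. This is a minimal sketch, not part of the existing tools: the column name dc.date.issued is an assumption, so substitute whatever header your date column actually has.

```python
import csv
import io
import re

# Exactly four digits, two digits, two digits, separated by hyphens.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def find_bad_dates(csv_text, date_column="dc.date.issued"):
    """Return (row_number, value) pairs whose date is not YYYY-MM-DD.

    Assumption: date_column is the header of the date column in the CSV.
    Empty cells are ignored.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    bad = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        value = (row.get(date_column) or "").strip()
        if value and not DATE_PATTERN.fullmatch(value):
            bad.append((row_number, value))
    return bad


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = "dc.title,dc.date.issued\nThesis A,2019-10-01\nThesis B,10/1/2019\n"
    for row_number, value in find_bad_dates(sample):
        print(f"Row {row_number}: {value!r} is not YYYY-MM-DD")
```

To check a real file, read it with open(path, newline="", encoding="utf-8-sig").read() and pass the text to find_bad_dates. The script only checks the shape of the date; it does not confirm that the month and day are valid.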
Run the SAF builder:
- Put the .csv metadata file and all of the files to be loaded together in one directory inside the ETD directory.
- Open Ubuntu.
- Use the command ls to list all the files in the current directory, and cd to change directories, to navigate to the directory with the safbuilder.sh file. Use cd .. to go up a level. Remember that directory names, file names, and commands are all case sensitive.
- Run the safbuilder by typing "sudo ./safbuilder.sh -c etd/path to metadata file." For example, "sudo ./safbuilder.sh -c etd/Oct2019etds/PDFs/ETDtempDspace_Oct2019Load.csv" would run the safbuilder on that metadata .csv file and all of the files in the directory with it. Note that the etd in the path must be lower case even though it's upper case in Windows. You can use the up arrow to cycle through previous commands so that you don't have to retype them. When you press Enter to run the command, you'll be prompted to enter your password.
- The program will make a lot of text appear in the terminal window. If it doesn't, the program didn't run. You probably made a typo when you typed the run command. Try again, and be sure to type it all correctly.
- When it has run correctly, the last line in the terminal window should indicate that ETDtempDspace.csv has been used 0 times, and that should be the only line with a "File:" error. See below:
- A SimpleArchiveFormat directory should appear in the folder that contains the files and the csv file.
- If there is more than the one "File:" error, something is wrong. See below:
These errors happen when the files in the folder and the filenames in the csv file don't match. Determine whether there is a problem that needs to be corrected by comparing your .csv file to the contents of your directory. If necessary, make the corrections, then delete your SimpleArchiveFormat directory and run the safbuilder again. If you can't fix the problems, or don't know what's causing them, ask Michelle for help. If she's not there, copy the errors into Word: press the Ctrl and PrtScn keys together to copy your screen to the clipboard, then paste it into Word. If there are many errors, scroll through them and repeat until they are all pasted into Word.
- If other errors occur, it's usually because of a typo in the command or path. Try to run it again.
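The comparison between the .csv file and the directory contents can also be scripted. The following Python sketch is not part of the safbuilder. It assumes that every file column has a header starting with "filename" (as in the labels used in this procedure) and that a cell listing several files separates them with "||"; the function name compare_files is hypothetical.

```python
import csv
import io


def compare_files(csv_text, directory_files, csv_name=None):
    """Return (missing, unused) file name lists.

    missing: names listed in the CSV but not present in the directory.
    unused:  names present in the directory but not listed in the CSV.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    listed = set()
    for row in reader:
        for header, cell in row.items():
            # Assumption: every file column header starts with "filename".
            if header and header.startswith("filename") and cell:
                # Assumption: several files in one cell are separated by "||".
                for name in cell.split("||"):
                    if name.strip():
                        listed.add(name.strip())
    on_disk = set(directory_files)
    if csv_name:
        on_disk.discard(csv_name)  # the metadata file itself is never listed
    return sorted(listed - on_disk), sorted(on_disk - listed)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = (
        "dc.title,filename__permissions:-r'Anonymous'__primary:true\n"
        "Thesis A,a.pdf\n"
        "Thesis B,b.pdf\n"
    )
    missing, unused = compare_files(sample, ["a.pdf", "c.pdf", "meta.csv"], "meta.csv")
    print("Listed in CSV but not in directory:", missing)
    print("In directory but not in CSV:", unused)
```

For a real run, pass os.listdir(your_directory) as directory_files and the text of your metadata file as csv_text. Names are compared exactly, so differences in case or stray spaces inside a name will show up as mismatches.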
Run the program to create the collection files:
- Check your CollectionFilesProgram directory for any old SAF directories and delete them.
- Move the entire current SimpleArchiveFormat directory into your CollectionFilesProgram directory.
- Open Ubuntu and zip the CollectionFilesProgram directory by:
- Navigating to the directory that contains the CollectionFilesProgram directory (probably the ETD folder), using ls to list the contents of the directory you're in and cd to change directories.
- Zip the CollectionFilesProgram directory using the command zip -r myfilename.zip CollectionFilesProgram/
- Email your zip of the CollectionFilesProgram directory to yourself.
- In your browser, go to https://elum.in/umbc-facstaff and log in.
- Log into myUMBC on elumin and download your zip file.
- Navigate to the zip file in File Explorer, then unzip it using Extract.
- Open the Windows Command Prompt by typing cmd into the Start menu.
- Navigate to the unzipped CollectionFilesProgram directory in downloads.
- Run the program by typing "python safscript.py"
- Look at the log, saf_log.txt, for any skipped items. If items have been skipped, fix them or ask Michelle to fix them, then rerun the SAF builder and redo all the steps after it. If no items have been skipped, zip the SAF directory.
- Send the zipped SAF directory to MD-SOAR help requesting that they load it.
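For a long log, a few lines of Python can pull out the entries that mention skipped items. This is a rough sketch only: the exact wording of saf_log.txt is not documented here, so the keyword "skip" and the sample log lines below are assumptions, and the function name find_skipped is hypothetical.

```python
def find_skipped(log_text, keyword="skip"):
    """Return (line_number, line) pairs for log lines containing the keyword.

    Assumption: skipped items are reported on lines containing "skip"
    (matched without regard to case). Adjust the keyword to match the
    wording your log actually uses.
    """
    matches = []
    for line_number, line in enumerate(log_text.splitlines(), start=1):
        if keyword.lower() in line.lower():
            matches.append((line_number, line.strip()))
    return matches


if __name__ == "__main__":
    # Made-up log lines, for illustration only; the real format may differ.
    sample_log = "Processed item_000\nSkipped item_001\nProcessed item_002\n"
    for line_number, line in find_skipped(sample_log):
        print(f"Line {line_number}: {line}")
```

To use it on the real log, read the file with open("saf_log.txt", encoding="utf-8").read() and pass the text in. An empty result only means the keyword was not found, so it is still worth opening the log once to confirm how skipped items are reported.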
Zip the SimpleArchiveFormat Directory
- Open Oracle VM Virtual Box and double click on Ubuntu.
- If Ubuntu goes to a red screen showing the time and date, it's locked. Drag the screen up to reveal the login screen to unlock it. The password is linux.
- Click on Activities, then type term. Double click the terminal icon.
- Change the directory to the one that contains the SimpleArchiveFormat directory.
- Run zip: zip -r outputfile.zip PathInputDirectory. The -r option is needed so that zip includes the contents of the directory, and the output file should have the extension .zip.
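If the zip command is unavailable, the same result can be produced with Python's standard library. This is an alternative sketch rather than the documented method; the function names zip_directory and list_archive and the output name saf_load are hypothetical.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_directory(source_dir, output_stem):
    """Zip source_dir (including its contents) and return the archive path.

    output_stem is the archive name without the .zip extension.
    """
    source = Path(source_dir).resolve()
    stem = Path(output_stem).resolve()
    # root_dir/base_dir keep the top-level directory name inside the archive.
    return shutil.make_archive(
        str(stem), "zip", root_dir=str(source.parent), base_dir=source.name
    )


def list_archive(archive_path):
    """Return the names stored in the zip archive, for a quick check."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.namelist()


if __name__ == "__main__":
    # Build a tiny stand-in directory so the example runs anywhere.
    work = Path(tempfile.mkdtemp())
    item = work / "SimpleArchiveFormat" / "item_000"
    item.mkdir(parents=True)
    (item / "contents").write_text("a.pdf\n")
    archive = zip_directory(work / "SimpleArchiveFormat", work / "saf_load")
    print(archive)
    print(list_archive(archive))
```

Listing the archive afterwards is a cheap way to confirm that the item subdirectories were actually included before emailing the file.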
Email the zipped files to DSS at CP:
...