Combine all parts of the load into one Excel file and one directory containing all of the files.
Search the date issued column for a / to ensure all dates are in the correct format, then sort by date to check for any others that aren’t in the correct format, and fix them.
Find and replace all instances of " ||" (a blank space followed by ||) with "||".
Find and replace all instances of "|| " (|| followed by a blank space) with "||".
Find and fix any single |
Find and fix any |||
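The delimiter cleanup in the four steps above can also be checked with a short script. This is only a sketch, not part of the documented procedure: the function names are hypothetical, and it assumes that || is the only valid delimiter, so any run of pipes that is not exactly two characters long needs a manual fix.

```python
import re


def normalize_delimiters(value: str) -> str:
    """Remove blank spaces on either side of each || delimiter."""
    return re.sub(r" *\|\| *", "||", value)


def find_bad_pipes(value: str) -> list[str]:
    """Return every run of pipes that is not exactly two characters long."""
    return [run for run in re.findall(r"\|+", value) if len(run) != 2]
```

For example, normalize_delimiters("Smith, A. || Jones, B.") returns "Smith, A.||Jones, B.", and find_bad_pipes flags any single | or ||| so it can be corrected by hand in the spreadsheet.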
Search the filename column for a single blank space to ensure there are no spaces in the file names.
Search for n/a in the entire spreadsheet and delete it where it’s been filled in.
Search for any empty cells in the relation.IsPartOf column and if there are any, complete them.
Check that Faculty Collection and Student Collections are correct.
Check for duplicate values by selecting the title column, going to Conditional Formatting, selecting Duplicate Values, and clicking OK. Then go to Filter and filter by color.
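Several of the spreadsheet checks above (spaces in file names, n/a values, empty relation.IsPartOf cells, duplicate titles) could also be run in one pass over an exported csv. The sketch below is an illustration only: the column header names "filename", "relation.IsPartOf", and "title" are assumptions and may not match the headers in your spreadsheet, and the function names are hypothetical. Titles are compared ignoring case and surrounding spaces, which is also an assumption.

```python
import csv
from collections import Counter


def load_rows(path):
    """Read a csv file into a list of dicts, one per spreadsheet row."""
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))


def check_rows(rows, filename_col="filename",
               part_of_col="relation.IsPartOf", title_col="title"):
    """Return (row_number, problem) tuples for a list of dict rows."""
    problems = []
    # Count each title once, ignoring case and surrounding spaces.
    titles = Counter((row.get(title_col) or "").strip().lower() for row in rows)
    for number, row in enumerate(rows, start=2):  # row 1 is the header row
        if " " in (row.get(filename_col) or ""):
            problems.append((number, "space in filename"))
        if any((value or "").strip().lower() == "n/a" for value in row.values()):
            problems.append((number, "n/a value"))
        if not (row.get(part_of_col) or "").strip():
            problems.append((number, "empty relation.IsPartOf"))
        title = (row.get(title_col) or "").strip().lower()
        if title and titles[title] > 1:
            problems.append((number, "duplicate title"))
    return problems
```

Calling check_rows(load_rows("metadata.csv")) would list the spreadsheet row numbers to revisit. It does not replace the visual checks in Excel, and it does not cover the Faculty Collection and Student Collections check.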
Create a new sheet in the Excel file, copy and paste the columns to be loaded into it, and save it.
While on the Excel sheet with the files to be loaded, choose Save As and change the file type to comma delimited csv. In Tools, go to Web Options and ensure that the encoding is set to Unicode (UTF-8), then save.
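If you want to confirm that the saved csv really is UTF-8, a quick check like the one below can help. It is a minimal sketch with a hypothetical function name. It only tests whether the file decodes as UTF-8 and says nothing about whether the metadata itself is correct.

```python
def is_utf8(path):
    """Return True if the whole file decodes as UTF-8, False otherwise."""
    try:
        with open(path, "rb") as handle:
            handle.read().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

If is_utf8 returns False, re-save the csv from Excel with the encoding set to Unicode (UTF-8).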
Be sure the csv is closed in all programs.
Put the .csv metadata file and all of the files to be loaded in the Zotload directory. (The directory should include all of the files that make up a work, including supplements, and a csv file with the metadata in it. The directory must be inside the directory mapped to Linux.)
Open Ubuntu.
Use the command ls to list all the files in the current directory, and cd to change directories until you reach the directory with the safbuilder.sh file. Use cd .. to go up a level. Remember that directory names, file names, and commands are all case sensitive.
Run the safbuilder by typing "sudo ./safbuilder.sh -c etd/path to metadata file." For example, "sudo ./safbuilder.sh -c etd/Oct2019etds/PDFs/ETDtempDspace_Oct2019Load.csv" would run the safbuilder on the metadata.csv file and all of the files in the directory with it.
...
The program will make a bunch of text appear in the terminal window. If it doesn't, the program didn't run, probably because of a typo in the run command. Try again, and be sure to type it all correctly. When the program runs successfully, it creates a SimpleArchiveFormat directory within the directory that you ran it on. The SimpleArchiveFormat directory contains numbered subdirectories: Item_000/, Item_001/, Item_002/, etc. Each of those subdirectories should contain a dublin_core.xml file, a contents file, and all the files that make up the work described in the metadata.
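The structure described above can be spot-checked with a small script. This is a sketch under stated assumptions: it looks only for the dublin_core.xml and contents files in each item subdirectory, it does not verify that the work's own files are present, and the function name is hypothetical.

```python
from pathlib import Path


def incomplete_items(saf_dir):
    """Return {item directory name: [missing required file names]}."""
    required = ("dublin_core.xml", "contents")
    report = {}
    for item in sorted(Path(saf_dir).iterdir()):
        if not item.is_dir():
            continue  # skip any stray files at the top level
        missing = [name for name in required if not (item / name).is_file()]
        if missing:
            report[item.name] = missing
    return report
```

An empty result from incomplete_items("SimpleArchiveFormat") means every item subdirectory has both required files.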
When it runs correctly, the last line in the terminal window should indicate that ETDtempDspace.csv has been used 0 times, and that should be the only line with a "File:" error. See below:
A SimpleArchiveFormat directory should appear in the folder that contains the files and the csv file.
If there is more than one "File:" error, something is wrong.
Errors happen when the files in the folder and the filenames in the csv file don't match. To determine whether there is a problem that needs to be corrected, compare your .csv file to the contents of your directory. If necessary, make the corrections, delete your SimpleArchiveFormat directory, and run the safbuilder again. If you can't fix the problems, or don't know what's causing them, ask Michelle for help. If she's not there, save the errors in Word: press the PrtScn and Ctrl keys together to copy your screen to the clipboard, then paste it into Word. If there are many errors, scroll through them and repeat until they are all pasted into Word.
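Comparing the csv file to the directory contents by eye is slow for a large load, so a script along these lines may help. It is a sketch, not part of the safbuilder: the function name is hypothetical, and it assumes the file names live in a column headed "filename" with multiple files in one cell separated by ||. Adjust both if your spreadsheet differs.

```python
import csv
from pathlib import Path


def compare_csv_to_directory(csv_path, directory,
                             filename_col="filename", separator="||"):
    """Return (listed in csv but not in directory, in directory but not listed)."""
    csv_path = Path(csv_path).resolve()
    listed = set()
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            for name in (row.get(filename_col) or "").split(separator):
                if name.strip():
                    listed.add(name.strip())
    # Every regular file in the directory except the metadata csv itself.
    present = {
        path.name
        for path in Path(directory).iterdir()
        if path.is_file() and path.resolve() != csv_path
    }
    return sorted(listed - present), sorted(present - listed)
```

The first list holds names that appear in the csv but have no matching file. The second holds files that no row refers to. Matching is case sensitive, the same as on Linux.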
If other errors occur, it's usually because of a typo in the command/path. Try to run it again.
Run the Collection Mapping Program
Installation instructions are here: User Documentation for Mapping Script
The program is here:
Run the Script:
Double-click the mapping_script.py file to execute the script.
Alternatively, run the script from the command line using the following command:
python Mapping_file_creation.py
Browse and Select Files:
Click on the "Browse" button next to "Folder Path" to select the folder containing the metadata files (the SimpleArchiveFormat folder).
Click on the "Browse" button next to "Excel File" to select the Collections Excel file.
Execute Script:
After selecting the folder and Excel file, click on the "Run Script" button.
The script will process the metadata files, map the collections, and generate collections files for each item.
View Logs:
Logs for the script execution are stored in a file named mapping.log, in the same folder as the script file.
This log file provides information about any errors encountered during the process.
Script Completion:
Once the script completes execution, a message box will appear indicating the successful creation of files or any errors encountered.
Zip the Final SimpleArchiveFormat directory and send it to Tech Support
Zip the final SimpleArchiveFormat directory by:
Navigate to the directory that contains the final SimpleArchiveFormat directory, using ls to view the contents of the directory you're in and cd to change directories.
Zip the SimpleArchiveFormat directory using the command zip -r myfilename.zip SimpleArchiveFormat/
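If the zip command is not available, Python's standard library can produce a comparable archive. The sketch below is an alternative, not the documented method. The function name is hypothetical, and "myfilename" is the same placeholder archive name used in the command above.

```python
import shutil
from pathlib import Path


def zip_saf(parent_dir, archive_name="myfilename"):
    """Zip parent_dir/SimpleArchiveFormat into parent_dir/<archive_name>.zip."""
    parent = Path(parent_dir).resolve()
    # base_dir keeps the SimpleArchiveFormat/ prefix inside the archive,
    # as zip -r does when run from the parent directory.
    return shutil.make_archive(
        str(parent / archive_name),
        "zip",
        root_dir=str(parent),
        base_dir="SimpleArchiveFormat",
    )
```

zip_saf returns the path of the archive it created, which is written next to the SimpleArchiveFormat directory.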
Send the zipped SAF directory to MD-SOAR help (mdsoar-help@umd.edu), Joseph Andrew Koivisto <jkoivist@umd.edu>, and Nima Asadi <nasadi1@umd.edu>, requesting that they load it and telling them which collection to load it into. You can attach it to the email; it will automatically be added to Google Drive, and you’ll be prompted to give mdsoar-help access when you click send. Give them edit access.
Create a subtask on the load that says “waiting for tech support to load” and attach your final combined spreadsheet there.