Albin O. Kuhn Library & Gallery - Staff Wiki


Using the File Combination Tool

The file combination tool does the following:

  • The tool handles file types DOCX, PPTX, and PDF.

  • It combines coversheets with files and fills in the filename with the name of the main file and the name of supplementary material files.

  • It converts dates to year-month-day (YEAR-MO-DY) format.

  • It fills in the extent.

Requirements

Python, pandas, PyPDF2, docx2pdf, spire-presentation, moviepy, python-calamine.

Installation

Python

Ensure Python is installed on your system.

  1. In the Windows search bar, type Command Prompt or cmd.

  2. Type python -V

This shows the installed Python version. Ensure it is 3.2 or higher.
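The same check can be made from inside Python. The following is a minimal sketch, not part of the tool itself; the name version_ok is hypothetical, and the minimum of 3.2 is simply the figure given above (the installed packages may need a newer version).

```python
import sys

# Minimum version stated on this page; raise it if your packages need newer.
MIN_VERSION = (3, 2)


def version_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if version_ok() else "too old"
    print("Python", sys.version.split()[0], status)
```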

Install pip packages

In the command prompt, run each of the following commands separately, or download requirements.txt and install all of the packages at once using the pip install -r command at the end of this section:

Pandas:

pip install pandas

PyPDF2:

pip install PyPDF2

docx2pdf:

pip install docx2pdf

spire-presentation:

pip install spire-presentation

moviepy:

pip install moviepy

python-calamine:

pip install python-calamine

If you downloaded requirements.txt, copy the full path of the folder where it was saved and use it in place of PATH:

pip install -r PATH/requirements.txt
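To confirm that the installation worked, something like the following sketch could be run. It is an illustration only: the function name missing_packages is hypothetical, it relies on importlib.metadata (available in Python 3.8 and later), and it assumes the package names are looked up exactly as they are written in the Requirements section.

```python
from importlib import metadata

# Package names as they appear in the Requirements section.
REQUIRED = ["pandas", "PyPDF2", "docx2pdf", "spire-presentation",
            "moviepy", "python-calamine"]


def missing_packages(names):
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the package is not installed
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Any name it reports as missing can be installed with the matching pip install command above.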

How to run the tool

  1. Ensure all necessary dependencies are installed

  2. If the spreadsheet that you’re working on doesn’t already contain columns AD (Main_file) and AE (Additional_file), add those columns to it

  3. In the Main_file column, fill in the name of the coversheet followed by a comma, followed by the name of the file

  4. In the Additional_files column, fill in the names of any supplementary files

  5. Download the script:

  6. Double-click the script, or type python File_combine_tool.py in the command prompt.

  7. Browse and select the Excel file containing the list of files to be combined.

  8. Browse and select the input folder containing the files listed in the Excel file.

  9. Browse and select the output folder where the combined files will be saved.

  10. Click the "Combine Files" button to start the file combination process.

  11. Once the process is complete, a message box will indicate successful completion, and the program will close after pressing OK.

  12. Check that all of the combined files are in the output folder.

  13. Check the Excel file. If a file could not be combined properly or an error was encountered, the filename cell shows “Please combine files manually” instead of the filename. Combine those files manually.

  14. Check the extent in the same way. If the extent cell shows “Please fill manually”, fill in the extent manually.
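The Main_file format from step 3 (coversheet name, comma, file name) can be checked before the tool is run. The sketch below is an assumption about how such a cell might be split, not the tool's actual code. The function name split_main_file is hypothetical, and it assumes exactly one comma per cell, so a file name that itself contains a comma would need different handling.

```python
def split_main_file(cell):
    """Split a Main_file cell into (coversheet, main_file).

    Assumes exactly one comma separates the two names.
    """
    parts = [part.strip() for part in str(cell).split(",")]
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Expected 'coversheet, file', got: {cell!r}")
    return parts[0], parts[1]


if __name__ == "__main__":
    # Hypothetical cell value, for illustration only.
    print(split_main_file("coversheet.docx, thesis.pdf"))
```

A cell that raises ValueError here is worth correcting in the spreadsheet before starting the tool.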

Notes

  1. Ensure the file names match the names written in the Excel sheet. File names in the Excel sheet should include their extensions; PDF file names are accepted with or without the extension.

  2. Close the Excel file before running the tool.

  3. Ensure all files and coversheets are in the input folder.
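The first and third notes can be checked ahead of time with a short script. This is a rough sketch under stated assumptions, not the tool's own logic: the function names resolve_input_file and missing_files are hypothetical, and a name that is not found is simply retried with ".pdf" appended, to mirror the note about PDF file names.

```python
from pathlib import Path


def resolve_input_file(folder, name):
    """Return the path of `name` inside `folder`, or None if it is absent.

    A name that is not found as written is retried with ".pdf" appended.
    """
    folder = Path(folder)
    candidate = folder / name
    if candidate.is_file():
        return candidate
    if not name.lower().endswith(".pdf"):
        pdf_candidate = folder / (name + ".pdf")
        if pdf_candidate.is_file():
            return pdf_candidate
    return None


def missing_files(folder, names):
    """Return the listed names that cannot be found in the input folder."""
    return [n for n in names if resolve_input_file(folder, n) is None]
```

Calling missing_files with the input folder and the names typed into the spreadsheet returns the entries that need fixing; an empty list means every listed file was found.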

After you’ve run the tool

  1. If the dc.format.extent column is blank, add the extent.

  2. Check the value given in the dc.date.issued column. If it’s not in the YEAR-MO-DY format, put it in that format. If it’s blank, add it using the YEAR-MO-DY format.
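When many dates need fixing by hand, a small helper can do the reformatting. The sketch below assumes that YEAR-MO-DY means a four-digit year, two-digit month, and two-digit day (for example 2021-03-05). The function name to_iso_date and the list of input layouts are assumptions for illustration, not something the tool defines.

```python
from datetime import datetime

# Input layouts tried in order; these are assumptions, extend as needed.
ASSUMED_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %B %Y")


def to_iso_date(text):
    """Return `text` as YYYY-MM-DD, or None if no assumed layout matches."""
    cleaned = str(text).strip()
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next layout
    return None
```

A result of None means the value matched none of the assumed layouts and should be checked by hand.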



Albin O. Kuhn Library & Gallery . University of Maryland, Baltimore County . 1000 Hilltop Circle . Baltimore MD 21250
(410) 455-2232. Questions and comments to: Web Services Librarian