The file combination tool does the following:
The tool handles file types DOCX, PPTX, and PDF.
It combines coversheets with files and fills in the filename with the name of the main file and the name of supplementary material files.
It converts dates to YEAR-MO-DY (year-month-day) format.
It fills in the extent.
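As an illustration of the date conversion, here is a minimal sketch. The function name to_iso_date and the list of accepted input formats are assumptions made for this example; the formats the tool itself recognizes are not documented here.

```python
from datetime import datetime

# Hypothetical list of input formats; the tool's real list is not documented here.
# Note that "%m/%d/%Y" reads 03/04/2021 as March 4, not April 3.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %B %Y", "%B %d, %Y")


def to_iso_date(text):
    """Return text as YEAR-MO-DY (e.g. 2021-03-15), or None if no format matches."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next candidate format
    return None


print(to_iso_date("03/15/2021"))
print(to_iso_date("March 15, 2021"))
```

Returning None for an unrecognized value leaves the decision to a person, which matches the manual check of the dc.date.issued column described later in this document.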
Requirements
python, pandas, PyPDF2, docx2pdf, spire-presentation, moviepy, python-calamine.
Installation
...
File Combination Tool
This tool is designed to combine various types of files (such as PDFs, DOCX, PPTX) into a single PDF file. It also handles conversion of DOCX and PPTX files to PDF format.
Requirements
Python 3.x
tkinter
pandas
PyPDF2
docx2pdf
spire.presentation
moviepy
Complete the following steps once, before using this tool for the first time:
Python:
Ensure Python is installed on your system.
In the Windows search bar, type Command Prompt or cmd and open it.
Type python -V
The Python version is displayed. Ensure it is 3.2 or higher.
Install pip packages:
In the command prompt, either run each of the following commands separately, or download requirements.txt and install all of the packages with the single command given after this list:
Pandas:
pip install pandas
PyPDF2:
pip install PyPDF2
docx2pdf:
pip install docx2pdf
spire-presentation:
pip install spire-presentation
moviepy:
pip install moviepy
python-calamine:
pip install python-calamine
If you downloaded requirements.txt, copy the full path of the folder where the file was saved, then run:
pip install -r PATH/requirements.txt
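To confirm that the installation succeeded, a short check along the following lines can help. This is a sketch, not part of the tool: the function name missing_modules is hypothetical, and the import names in MODULES are assumed to correspond to the pip packages listed above.

```python
import importlib.util

# Import names assumed to correspond to the pip packages listed above.
MODULES = ["pandas", "PyPDF2", "docx2pdf", "spire.presentation",
           "moviepy", "python_calamine"]


def missing_modules(names):
    """Return the entries of names that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (for example "spire") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


print("Missing modules:", missing_modules(MODULES) or "none")
```

Any name reported as missing can be installed with the matching pip command from the list above.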
...
Usage Instructions
Ensure all necessary dependencies are installed.
If the spreadsheet that you’re working on doesn’t already contain columns AE, Main_file, and AF, Additional_files, add those columns to it.
In the Main_file column, fill in the name of the coversheet, followed by a comma, followed by the name of the main file.
In the Additional_files column, fill in the name of any supplementary file. (If there is more than one, zip them and enter the zip file name here.)
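The Main_file format described above (coversheet, comma, main file) can be checked with a small helper like the one below. This is a sketch under assumptions: the function name split_main_file_cell and the sample file names are hypothetical, and it assumes the coversheet name itself contains no comma.

```python
def split_main_file_cell(cell):
    """Split 'coversheet, main file' into a (coversheet, main_file) tuple."""
    # Split on the first comma only; assumes the coversheet name has no comma.
    coversheet, separator, main_file = cell.partition(",")
    if not separator or not coversheet.strip() or not main_file.strip():
        raise ValueError(f"Expected 'coversheet, main file', got: {cell!r}")
    return coversheet.strip(), main_file.strip()


# Hypothetical file names, for illustration only.
print(split_main_file_cell("coversheet_001.docx, thesis_001.pdf"))
```

A cell that raises ValueError here is missing either the comma or one of the two names.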
Download the script:
...
Download File_combine_tool.py.
Ensure the Excel file and all main and supplementary files are closed before running the script.
Run the Script:
· Double-click the File_combine_tool.py file to execute the script.
· Alternatively, run the script from the command line using the following command:
python File_combine_tool.py
Browse and Select Files:
Browse and select the Excel file containing the list of files to be combined.
Browse and select the input folder containing the files listed in the Excel file.
Browse and select the output folder where the combined files will be saved.
...
Execute Script:
After selecting the Excel file and the input and output folders, click the "Combine Files" button.
Script Completion:
Once the process is complete, a message box indicates successful completion, and the program closes after you press OK.
...
Once the process is complete, check if all the files are in the output folder.
Check the Excel file. If a file was not combined properly or an error was encountered, the spreadsheet shows “Please combine files manually” instead of the filename.
Check the extent in the same way. If it could not be filled in, the spreadsheet shows “Please fill manually”.
...
View Logs:
Logs for the script execution are stored in a file named file_combine.log which is in the same folder as the script.
This log file provides information about any errors encountered during file processing, such as a file not found in the input folder, an error while opening a file, or an error while converting or combining files.
Try to resolve these errors. Once they are resolved, you can either process the affected files manually or run the script again.
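For a long log, a short filter can pull out the lines worth reading. The sketch below assumes the log is plain text and that error entries contain the word "error"; the exact log format is not documented here, and the function name error_lines is hypothetical.

```python
from pathlib import Path


def error_lines(log_path, keyword="error"):
    """Return the lines of the log that contain keyword, ignoring case."""
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    return [line for line in text.splitlines() if keyword.lower() in line.lower()]


if __name__ == "__main__":
    # The log sits in the same folder as the script, so run this from there.
    log_file = Path("file_combine.log")
    if log_file.exists():
        for line in error_lines(log_file):
            print(line)
```

Passing a different keyword, such as "not found", narrows the output to one kind of problem.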
Notes
Ensure file names match the names written in the Excel sheet. (File names in the Excel sheet should include their extensions. PDF file names are accepted with or without the extension.)
Close the excel file before running the tool.
Ensure all files and coversheets are in the input folder.
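These checks can be done before running the tool with a small script such as the following sketch. The function name find_missing is hypothetical, and the rule that a name without an extension refers to a PDF is an assumption based on the note above, not a description of the tool's own matching logic.

```python
from pathlib import Path


def find_missing(names, input_folder):
    """Return the names that have no matching file in input_folder."""
    folder = Path(input_folder)
    missing = []
    for name in names:
        candidates = [folder / name]
        # Assumption: a name written without an extension refers to a PDF.
        if not Path(name).suffix:
            candidates.append(folder / (name + ".pdf"))
        if not any(path.is_file() for path in candidates):
            missing.append(name)
    return missing
```

For example, find_missing(["cover.docx", "thesis"], "input") returns the names from that list with no matching file in a folder called input (a hypothetical folder name). A name that contains a dot but no real extension is treated as already having an extension, so such names need to be checked by hand.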
After You’ve run the Tool:
If the dc.format.extent column is blank, add the extent.
Check the value given in the dc.date.issued column. If it’s not the YEAR-MO-DY format, put it in that format. If it’s blank, add it using the YEAR-MO-DY format.
...
If any one of the files (coversheet, main file, or additional file) listed in the spreadsheet is missing from the input folder, the spreadsheet shows the error “Please combine files manually”. The logs show which file caused the error.
If the file is in the input folder but the logs still report a “file not found” error, the file name in the spreadsheet does not match the name of the file in the folder.