Python in Practice: Converting PDF to JPG Images


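I needed to convert PDF files to JPG images and give the converted files consistent names such as 0001.jpg. Most of the tools available online add a watermark, so I wrote a small conversion tool in Python.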

Development environment

  1. Python 3.8
  2. PyCharm Community Edition
  3. Pillow
  4. Poppler
  5. PyInstaller, to package the tool as a standalone exe
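
Before running a conversion it can help to confirm that the Poppler binaries are actually reachable, since pdf2image calls Poppler's pdftoppm tool. The following is a minimal sketch using only the standard library. The function name `find_pdftoppm` and its `poppler_dir` parameter are illustrative assumptions, not part of pdf2image.

```python
import os
import shutil


def find_pdftoppm(poppler_dir=None):
    """Return the full path of the pdftoppm executable, or None if not found.

    poppler_dir is a hypothetical folder holding the Poppler binaries;
    when it is None, the system PATH is searched instead.
    """
    # shutil.which also tries the Windows executable suffixes (PATHEXT)
    return shutil.which("pdftoppm", path=poppler_dir)


if __name__ == "__main__":
    location = find_pdftoppm()
    if location is None:
        print("pdftoppm not found; install Poppler or pass its folder")
    else:
        print("pdftoppm found at " + location)
```

Passing the bundled folder, for example `find_pdftoppm(os.path.join(exe_path, "poppler"))`, would check the copy shipped next to the exe rather than the one on PATH.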

Code

#!/usr/bin/python3
# -*- coding:utf-8 -*-

"""pdf2image converter"""

__author__ = "mine"

import os
import sys
import time

# Raw string so the backslashes in the Windows path are not treated as escapes
sys.path.append(r"C:\Software\Python38\Lib\site-packages")
from pdf2image import pdf2image

if __name__ == '__main__':
    print(sys.argv)
    if len(sys.argv) == 2:
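        # Directory the script or exe was launched from
        exe_full_path = sys.argv[0]
        exe_path = os.path.dirname(exe_full_path)
        # Check that the bundled poppler dependency is present
        poppler_path = os.path.join(exe_path, "poppler")
        if os.path.exists(poppler_path):
            print(poppler_path + " check passed")
        else:
            print(poppler_path + " not found")
            sys.exit(1)
        pdf_path = sys.argv[1]
        # splitext keeps file names that contain extra dots intact
        dir_name = os.path.splitext(os.path.basename(pdf_path))[0]
        dir_path = os.path.dirname(pdf_path)
        # os.path.join also works when the pdf is given without a directory
        dir_full_path = os.path.join(dir_path, dir_name)
        # Create the folder for the jpg files, named after the pdf file
        if not os.path.exists(dir_full_path):
            os.mkdir(dir_full_path)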

        # Run the conversion
        print("create image, please wait...")
        pdf2image.convert_from_path(
            pdf_path,
            output_folder=dir_full_path,
            dpi=300,
            fmt="jpg",
            thread_count=10,
            poppler_path=poppler_path,
            output_file="file")
        print("create image done, now rename")
        # Wait for the conversion exe to exit and release its file locks
        time.sleep(5)
        # Rename the files to 0001.jpg, 0002.jpg, ...
        for file in os.listdir(dir_full_path):
            # Generated names look like "<prefix>-<page>.jpg"
            file_split = file.split("-")
            if len(file_split) != 2:
                # Skip files that do not match the generated pattern
                continue
            image_index = os.path.splitext(file_split[1])[0]
            # Left-pad the page number with zeros to four digits
            new_image_path = os.path.join(dir_full_path, "%s.jpg" % image_index.zfill(4))
            os.rename(os.path.join(dir_full_path, file), new_image_path)
        # Conversion finished
        print("work done")
    else:
        print("argv size is not 2")
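
The renaming step in the script can also be pulled out into a small function that is easy to try on its own. This is only a sketch, assuming the generated files are named `<prefix>-<page>.jpg` as the loop above expects. The helper names `target_name` and `rename_pages` and the sample file names are made up for illustration.

```python
import os
import tempfile


def target_name(file_name, width=4):
    """Map a generated name such as 'file0001-07.jpg' to '0007.jpg'.

    Returns None when the name does not look like '<prefix>-<page>.jpg'.
    """
    stem, ext = os.path.splitext(file_name)
    prefix, sep, page = stem.rpartition("-")
    if not sep or not page.isdigit() or ext.lower() != ".jpg":
        return None
    # zfill left-pads the page number with zeros up to the given width
    return page.zfill(width) + ".jpg"


def rename_pages(folder):
    """Rename every matching file in folder and return the new names, sorted."""
    renamed = []
    for name in sorted(os.listdir(folder)):
        new_name = target_name(name)
        if new_name is None:
            continue  # leave unrelated files alone
        os.replace(os.path.join(folder, name), os.path.join(folder, new_name))
        renamed.append(new_name)
    return sorted(renamed)


if __name__ == "__main__":
    # Demo on empty placeholder files in a temporary folder
    with tempfile.TemporaryDirectory() as folder:
        for name in ("file0001-01.jpg", "file0001-02.jpg", "notes.txt"):
            open(os.path.join(folder, name), "w").close()
        print(rename_pages(folder))
```

Files that do not match the pattern are left alone, so running the function a second time on the same folder does not touch the already renamed images.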

Summary:

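The same conversion could also be written in C++ (Poppler, which pdf2image wraps, provides a C++ interface), but developing in Python is faster: install the required modules with pip, import them, and start using them with no IDE configuration, which leaves more attention for the actual task.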