Python is a very popular programming language, known for its versatility, straightforward syntax and flexibility. One of its greatest advantages is its ability to automate repetitive, tedious tasks. For example, if you want to send the same letter numerous times with personalised fields, e.g., a person’s name or address, Python can be used to automate this process, significantly reducing the manual burden.
At Bays, we have already published two blogs outlining the process of creating Word Documents from scratch using Python, in addition to filling in the gaps of an already existing Word Document:
- Generating Word Documents Using a Template in Python
- How to create word documents (.docx) using Python
This process is quite straightforward, with lots of open-source documentation available online to help. However, when following these tutorials, a couple of things came up that seemed to be slightly more challenging at first glance:
- Adding a hyperlink to the Word Document
- Converting the Word Document to a PDF using Python
In this blog, I will cover how to do both of these additional steps.
Adding a Hyperlink
Before anything else, a Word Document template should be created.
Any information you want to add to this document should be entered in the format {{ x }}, as seen in the image below.
Here I have added {{ rt }} and formatted it in the way I want the hyperlink to appear in the document.
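Since every {{ x }} field in the template needs a matching key in the data, a quick check for missing keys can save a confusing render later. The sketch below is only an illustration using a plain string in place of the template text; the names PLACEHOLDER, missing_fields and letter are hypothetical, and it does not read a real .docx file (where Word may split a placeholder across several XML runs).

```python
import re

# Matches simple placeholders such as {{ name }} or {{rt}} (assumed format).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def missing_fields(template_text: str, context: dict) -> set:
    """Return placeholder names in template_text that context does not supply."""
    return set(PLACEHOLDER.findall(template_text)) - set(context)


# A plain string standing in for the text of a template (hypothetical example).
letter = "Dear {{ name }}, welcome to {{ company_name }}. See {{ rt }}."
print(missing_fields(letter, {"name": "Holly", "company_name": "Bays Consulting"}))
# prints {'rt'}
```

Here the check reports rt because no hyperlink has been supplied for that field.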
For the purpose of this blog, the Python package Faker has been used to generate fake data to fill in the missing fields.
The following cells import all of the required libraries and create the data we want to use to fill in the gaps.
# install packages (faker and pywin32 are also imported below)
!pip install docxtpl faker pywin32
# importing relevant libraries
import pandas as pd
import jinja2
from win32com import client
import os
from datetime import datetime
from docxtpl import DocxTemplate, RichText
from faker import Faker
fake = Faker()
# Creating a dict that contains all information to be added to the Word doc
# Generate one fake address and reuse it so both lines belong to the same address
address_lines = fake.address().split('\n')
context = {'company_name': "Bays Consulting",
           'date': fake.date(),
           'first_line_address': address_lines[0],
           'second_line_address': address_lines[1],
           'name': fake.name(),
           'my_name': "Holly"}
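By default, fake.date() returns a string in the form YYYY-MM-DD, which can look a little stark at the top of a letter. As a minimal sketch, assuming you would prefer a "day month year" style, the date string could be reformatted with the standard library before it goes into the context. The helper name format_letter_date is hypothetical.

```python
from datetime import datetime


def format_letter_date(iso_date: str) -> str:
    """Convert a 'YYYY-MM-DD' string into a 'DD Month YYYY' letter date."""
    return datetime.strptime(iso_date, "%Y-%m-%d").strftime("%d %B %Y")


print(format_letter_date("2024-03-05"))  # prints 05 March 2024
```

The month name follows the active locale, which is English unless the locale has been changed.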
The data created above is simply inserted into the template document in the following step (explained in more detail in the previous blog).
However, in this next step we are also adding a hyperlink to the Word Document by using a RichText object.
jinja_env = jinja2.Environment(autoescape=True)
template = DocxTemplate("example_template.docx")
# the hyperlink we want inserted
url = "https://baysconsulting.co.uk/"
# adding hyperlink to text
rt = RichText()
rt.add(url, url_id=template.build_url_id(url))
context["rt"] = rt
# rendering document
template.render(context, jinja_env)
print(f'Finished rendering {context["name"]}')
template.save(f"{context['name']}.docx")
Finished rendering Mrs. Jacqueline Medina
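If the links come from a spreadsheet or user input rather than being typed into the script, it may be worth checking them before they are added to the document. The following is a rough sketch using only the standard library; the helper name is_web_url is hypothetical, and it only checks the shape of the URL, not whether the page exists.

```python
from urllib.parse import urlparse


def is_web_url(url: str) -> bool:
    """Return True if url has an http(s) scheme and a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_web_url("https://baysconsulting.co.uk/"))  # prints True
print(is_web_url("baysconsulting.co.uk"))           # prints False
```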
Once the rendering code has been run, a Word Document should be saved in which all of the missing fields are now filled in.
An example of this can be seen below.
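One detail of the save step is that the file name is taken directly from the person's name. That works for the Faker data used here, but real names or other fields can contain characters that are awkward or invalid in file names. A minimal sketch of one way to tidy them up, assuming underscores are an acceptable replacement, is shown below. The helper name safe_filename is hypothetical.

```python
import re


def safe_filename(name: str, extension: str = ".docx") -> str:
    """Replace runs of characters other than letters, digits, '_' and '-' with '_'."""
    stem = re.sub(r"[^\w\-]+", "_", name).strip("_")
    return f"{stem}{extension}"


print(safe_filename("Mrs. Jacqueline Medina"))  # prints Mrs_Jacqueline_Medina.docx
```

Note that two people with the same name would still produce the same file name, so a unique ID may be needed for larger mail-outs.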
Saving to PDF
Once the Word Document has been saved, it can be converted to a PDF using the following function. It drives Microsoft Word through win32com, so it needs to run on Windows with Word installed.
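def convert_to_pdf(filepath: str):
    """Save a PDF copy of a .docx file next to the original."""
    # Start Word before the try block so `word` always exists in `finally`
    word = client.DispatchEx("Word.Application")
    try:
        target_path = filepath.replace(".docx", ".pdf")
        word_doc = word.Documents.Open(filepath)
        # FileFormat=17 is Word's wdFormatPDF constant
        word_doc.SaveAs(target_path, FileFormat=17)
        word_doc.Close()
    finally:
        # Always shut Word down, even if the conversion fails
        word.Quit()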
convert_to_pdf(os.path.abspath(f"{context['name']}.docx"))
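The function builds the PDF path with filepath.replace(".docx", ".pdf"), which replaces every occurrence of ".docx" in the path, including any in a folder name. As a small sketch of an alternative, pathlib can swap only the final extension. The helper name pdf_path_for is hypothetical and is not part of the function above.

```python
from pathlib import Path


def pdf_path_for(docx_path: str) -> str:
    """Return docx_path with only its final extension changed to .pdf."""
    return str(Path(docx_path).with_suffix(".pdf"))


print(pdf_path_for("Mrs. Jacqueline Medina.docx"))  # prints Mrs. Jacqueline Medina.pdf
```

The full stop after "Mrs" is left alone because with_suffix only looks at the last extension in the file name.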
In conclusion, mastering the art of generating Word documents and PDFs through Python can significantly streamline your workflow, saving you valuable time and effort. By incorporating hyperlinks and seamlessly converting documents to PDFs, you enhance the functionality and accessibility of your documents. As you dive deeper into Python’s capabilities, you’ll discover endless possibilities for automation and efficiency. Stay tuned for more tutorials and tips to improve your Python skills and boost your productivity. Happy coding!