How to create Word documents (.docx) using Python

This blog will show you how to create Word documents (.docx) using Python. This can be very useful, especially for generating reports. I’ll build a yearly sales report as the example.

The library used in this blog is python-docx v0.8.11.

In [1]:

# !pip install python-docx==0.8.11

Libraries:

In [2]:

import pandas as pd, numpy as np
import matplotlib.pyplot as plt

# docx
from docx import Document
from docx.shared import Inches

In [3]:

import warnings
warnings.filterwarnings("ignore")

Using Python, we can create MS Word documents (.docx) containing headings, paragraphs, tables and pictures, amongst others. The script below shows some of the main things one might want to do in an MS Word document.

More info can be found in the documentation: https://python-docx.readthedocs.io/en/latest/.

In [4]:

# the document is assigned to 'document'
document = Document()

# add headings
document.add_heading('Heading 0', level=0)
document.add_heading('Heading 1', level=1)
document.add_heading('Heading 2', level=2)
# f-strings can be used as well
heading_size = 3
document.add_heading(f'Heading {heading_size}', heading_size)  

# calling '.add_paragraph' with no text adds an empty paragraph, which acts as a blank line
p = document.add_paragraph()

# otherwise, text can be passed in the brackets. note that each paragraph is assigned to a different name,
# which helps in differentiating and customising them
para_0 = document.add_paragraph('This is how we input paragraphs in the document. This paragraph has been declared as '
                                'variable para_0.')

# adding text and setting some of the text to bold and italics by using '.add_run'
para_1 = document.add_paragraph('This is another paragraph para_1. By using the variables that they are assigned to, '
                                'we can manipulate the texts such as ')
para_1.add_run('setting text to bold, ').bold = True
para_1.add_run('or if you prefer italics. ').italic = True
para_1.add_run('Notice that they all form part of the same paragraph and are on the same line.')

# adding an unordered list
document.add_paragraph('First item in unordered list.', style='List Bullet')
document.add_paragraph('Second item in unordered list.', style='List Bullet')

# adding an ordered list
document.add_paragraph('First item in ordered list.', style='List Number')
document.add_paragraph('Second item in ordered list.', style='List Number')

# adding a picture and customising the size/width
# note: 'maths.jpg' must exist in the working directory (a file path or a file-like object is accepted)
document.add_picture('maths.jpg', width=Inches(2.5))

# adding a table using a tuple of tuples
# any iterable of rows works, e.g. df.itertuples(index=False) for a dataframe
# first describe the data
records = (
            ('FR0101', 'French', 73),
            ('MA1019', 'Mathematics', 97),
            ('EN0631', 'English', 84)
    )

# create a table with 3 columns
# only 1 row has been created for the header
# as we iterate through the data, we will add rows accordingly
table = document.add_table(rows=1, cols=3)
# naming the header cells
header_cells = table.rows[0].cells
header_cells[0].text = 'Subject Code'
header_cells[1].text = 'Subject'
header_cells[2].text = 'Grades'
# iterate through the data
# add a new row
# assign data points to the cells as required
for subject_code, subject, grade in records:
    row_cells = table.add_row().cells
    row_cells[0].text = subject_code
    row_cells[1].text = subject
    row_cells[2].text = str(grade)

# save as a .docx document
document.save('testing_basic_document.docx')

The above script shows how to use some of the main features of the python-docx library.

Now let’s get a dataset and do something more fun (and probably more useful).

Getting the raw data: (Dataset obtained from Kaggle. Link: https://www.kaggle.com/datasets/rohitsahoo/sales-forecasting)

In [5]:

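# note: the dates in this file appear to be day-first (some parsed ship dates below fall before their order dates);
# if so, passing dayfirst=True to read_csv would parse them as intended
raw_data = pd.read_csv('superstore_sales.csv', parse_dates=['Order Date', 'Ship Date'])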
raw_data.head()

Out[5]:

	Row ID	Order ID	Order Date	Ship Date	Ship Mode	Customer ID	Customer Name	Segment	Country	City	State	Postal Code	Region	Product ID	Category	Sub-Category	Product Name	Sales
0	1	CA-2017-152156	2017-08-11	2017-11-11	Second Class	CG-12520	Claire Gute	Consumer	United States	Henderson	Kentucky	42420.0	South	FUR-BO-10001798	Furniture	Bookcases	Bush Somerset Collection Bookcase	261.9600
1	2	CA-2017-152156	2017-08-11	2017-11-11	Second Class	CG-12520	Claire Gute	Consumer	United States	Henderson	Kentucky	42420.0	South	FUR-CH-10000454	Furniture	Chairs	Hon Deluxe Fabric Upholstered Stacking Chairs,…	731.9400
2	3	CA-2017-138688	2017-12-06	2017-06-16	Second Class	DV-13045	Darrin Van Huff	Corporate	United States	Los Angeles	California	90036.0	West	OFF-LA-10000240	Office Supplies	Labels	Self-Adhesive Address Labels for Typewriters b…	14.6200
3	4	US-2016-108966	2016-11-10	2016-10-18	Standard Class	SO-20335	Sean O’Donnell	Consumer	United States	Fort Lauderdale	Florida	33311.0	South	FUR-TA-10000577	Furniture	Tables	Bretford CR4500 Series Slim Rectangular Table	957.5775
4	5	US-2016-108966	2016-11-10	2016-10-18	Standard Class	SO-20335	Sean O’Donnell	Consumer	United States	Fort Lauderdale	Florida	33311.0	South	OFF-ST-10000760	Office Supplies	Storage	Eldon Fold ‘N Roll Cart System	22.3680

Only a subset of the dataset will be used in this blog:

In [6]:

data = raw_data[['Order Date', 
                 'Ship Date', 
                 'Ship Mode', 
                 'Segment', 
                 'Country', 
                 'City', 
                 'State',
                 'Postal Code', 
                 'Region', 
                 'Category', 
                 'Sub-Category', 
                 'Sales']]
data.head()

Out[6]:

	Order Date	Ship Date	Ship Mode	Segment	Country	City	State	Postal Code	Region	Category	Sub-Category	Sales
0	2017-08-11	2017-11-11	Second Class	Consumer	United States	Henderson	Kentucky	42420.0	South	Furniture	Bookcases	261.9600
1	2017-08-11	2017-11-11	Second Class	Consumer	United States	Henderson	Kentucky	42420.0	South	Furniture	Chairs	731.9400
2	2017-12-06	2017-06-16	Second Class	Corporate	United States	Los Angeles	California	90036.0	West	Office Supplies	Labels	14.6200
3	2016-11-10	2016-10-18	Standard Class	Consumer	United States	Fort Lauderdale	Florida	33311.0	South	Furniture	Tables	957.5775
4	2016-11-10	2016-10-18	Standard Class	Consumer	United States	Fort Lauderdale	Florida	33311.0	South	Office Supplies	Storage	22.3680

In [7]:

data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9800 entries, 0 to 9799
Data columns (total 12 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   Order Date    9800 non-null   datetime64[ns]
 1   Ship Date     9800 non-null   datetime64[ns]
 2   Ship Mode     9800 non-null   object        
 3   Segment       9800 non-null   object        
 4   Country       9800 non-null   object        
 5   City          9800 non-null   object        
 6   State         9800 non-null   object        
 7   Postal Code   9789 non-null   float64       
 8   Region        9800 non-null   object        
 9   Category      9800 non-null   object        
 10  Sub-Category  9800 non-null   object        
 11  Sales         9800 non-null   float64       
dtypes: datetime64[ns](2), float64(2), object(8)
memory usage: 918.9+ KB

The data ranges from 2015 to 2018. Only the orders placed in 2016 and 2017 (based on the Order Date column) will be used. The two dataframes below are the subsets for 2016 and 2017 respectively:

In [8]:

# select by year so that orders on 1 January and 31 December are included
mask_2016 = data['Order Date'].dt.year == 2016
data_2016 = data.loc[mask_2016]

data_2016.head()

Out[8]:

	Order Date	Ship Date	Ship Mode	Segment	Country	City	State	Postal Code	Region	Category	Sub-Category	Sales
3	2016-11-10	2016-10-18	Standard Class	Consumer	United States	Fort Lauderdale	Florida	33311.0	South	Furniture	Tables	957.5775
4	2016-11-10	2016-10-18	Standard Class	Consumer	United States	Fort Lauderdale	Florida	33311.0	South	Office Supplies	Storage	22.3680
14	2016-11-22	2016-11-26	Standard Class	Home Office	United States	Fort Worth	Texas	76106.0	Central	Office Supplies	Appliances	68.8100
15	2016-11-22	2016-11-26	Standard Class	Home Office	United States	Fort Worth	Texas	76106.0	Central	Office Supplies	Binders	2.5440
24	2016-09-25	2016-09-30	Standard Class	Consumer	United States	Orem	Utah	84057.0	West	Furniture	Tables	1044.6300

In [9]:

# select by year so that orders on 1 January and 31 December are included
mask_2017 = data['Order Date'].dt.year == 2017
data_2017 = data.loc[mask_2017]

data_2017.head()

Out[9]:

	Order Date	Ship Date	Ship Mode	Segment	Country	City	State	Postal Code	Region	Category	Sub-Category	Sales
0	2017-08-11	2017-11-11	Second Class	Consumer	United States	Henderson	Kentucky	42420.0	South	Furniture	Bookcases	261.960
1	2017-08-11	2017-11-11	Second Class	Consumer	United States	Henderson	Kentucky	42420.0	South	Furniture	Chairs	731.940
2	2017-12-06	2017-06-16	Second Class	Corporate	United States	Los Angeles	California	90036.0	West	Office Supplies	Labels	14.620
13	2017-05-12	2017-10-12	Standard Class	Consumer	United States	Seattle	Washington	98103.0	West	Office Supplies	Binders	407.976
21	2017-09-12	2017-12-13	Standard Class	Corporate	United States	Fremont	Nebraska	68025.0	Central	Office Supplies	Art	19.460
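When taking a yearly subset from a datetime column, the boundary dates are easy to get wrong. The sketch below uses a handful of made-up dates (not the Kaggle data) to compare a strict date-range mask with selecting on `.dt.year`. The dataframe `toy` and the variable names are assumptions for illustration only.

```python
import pandas as pd

# made-up orders around the edges of 2016
toy = pd.DataFrame({
    'Order Date': pd.to_datetime(
        ['2015-12-31', '2016-01-01', '2016-06-15', '2016-12-31', '2017-01-01']
    ),
    'Sales': [10.0, 20.0, 30.0, 40.0, 50.0],
})

# a range mask with '>' on 1 January and '<=' on 30 December drops both edge days
range_mask = (toy['Order Date'] > '2016-01-01') & (toy['Order Date'] <= '2016-12-30')

# selecting on the year keeps every order dated in 2016
year_mask = toy['Order Date'].dt.year == 2016

rows_by_range = int(range_mask.sum())
rows_by_year = int(year_mask.sum())
```

On this toy data the range mask keeps only the June order, while the year mask keeps all three 2016 orders.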

Let’s pretend you work at a company and have to generate yearly reports from the sales dataset. The year 2017 will be referred to as current_year. The data for the previous year (2016) will only be used for comparisons.

The plan is to generate a report which includes the following metrics:

1. Total sales + comparison of total sales with last year (text)
2. Monthly sales + percentage increase/decrease compared to the same month of last year (table)
3. Segment (pie chart)
4. Ship Mode (pie chart)
5. Category (bar chart)

In [10]:

# giving names to the things that will be used to avoid hardcoding
current_year = 2017
current_year_data = data_2017
previous_year_data = data_2016

To begin with, let’s create our document. And while we are at it, let’s add a heading as well.

In [11]:

# creating the document
report_doc = Document()
# adding a heading of level=0
report_doc.add_heading(f'Yearly report {current_year}', level=0)

Out[11]:

<docx.text.paragraph.Paragraph at 0x7f244c4861c0>

In an ideal world, we would have all the functions ready and then write the script that builds the MS Word document. Here, however, I will write the functions as we go.

In [12]:

# add a short intro paragraph
p_intro = report_doc.add_paragraph(f'This is the yearly report for the year {current_year}. We will analyse different metrics '
                                   'for sales, customer attributes, shipping modes and types of products.')

With that being done, let’s jump into the sales:

We need the total sales for the current and previous years.
We need the monthly sales for the current and previous years.

1. Total sales:

Note: I’ll make use of functions as much as I can to keep things general and reproducible.

In [13]:

def get_yearly_sales(year_df: pd.DataFrame, col: str = 'Sales') -> float:
    """
    This function returns the sum of the sales rounded to 2 decimal places as a float.
    """
    return year_df[col].sum().round(2)

In [14]:

# use function to get total sales values
current_total_sales = get_yearly_sales(current_year_data)
previous_total_sales = get_yearly_sales(previous_year_data)
# calculate the percentage change in the total sales
percentage_change_total_sales = ((current_total_sales - previous_total_sales) / previous_total_sales * 100).round(1)

In [15]:

# let's spice it up a bit
if percentage_change_total_sales > 0:
    percentage_change_text = f"there has been an increase of {percentage_change_total_sales}% in total sales."
elif percentage_change_total_sales < 0:
    percentage_change_text = f"there has been a decrease of {abs(percentage_change_total_sales)}% in total sales."
else:
    percentage_change_text = "there has been no change in the total sales."
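The percentage change and its wording can also be wrapped into one small function, which makes it easy to handle the case where last year's total is zero (the division above would fail or give `inf`). This is only a sketch in plain Python. `describe_change` is a hypothetical helper name and the zero-sales message is my own wording.

```python
def describe_change(current: float, previous: float) -> str:
    """Return a sentence describing the change between two yearly totals."""
    # guard against dividing by zero when there were no sales last year
    if previous == 0:
        return "no comparison is available because last year's total sales were zero."

    # percentage change relative to the previous year, rounded to 1 decimal place
    change = round((current - previous) / previous * 100, 1)

    if change > 0:
        return f"there has been an increase of {change}% in total sales."
    if change < 0:
        return f"there has been a decrease of {abs(change)}% in total sales."
    return "there has been no change in the total sales."


increase_text = describe_change(120.0, 100.0)
decrease_text = describe_change(90.0, 100.0)
no_change_text = describe_change(100.0, 100.0)
zero_previous_text = describe_change(50.0, 0.0)
```

The returned text can then be passed to `.add_run` in the same way as `percentage_change_text`.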

Adding the total sales section (text) to the document:

In [16]:

# add a subheading
report_doc.add_heading('Sales:', level=1)

# add paragraph for the total sales
p_yearly_sales = report_doc.add_paragraph(f'The total sales for the year {current_year} is ')
p_yearly_sales.add_run(f"$ {current_total_sales}").bold = True
p_yearly_sales.add_run(f'. Compared to the year {current_year-1}, ')           
p_yearly_sales.add_run(percentage_change_text)  # this part of the text will depend on the conditions set above

Out[16]:

<docx.text.run.Run at 0x7f244c492df0>
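The total above goes into the report as a raw float. If you would like thousands separators and a fixed number of decimals, Python's format specification can do that. A small sketch follows. `format_currency` and `format_percentage` are helper names I made up for illustration, and the numbers are arbitrary.

```python
def format_currency(value: float) -> str:
    """Format a number as dollars with thousands separators and 2 decimals."""
    return f"${value:,.2f}"


def format_percentage(value: float) -> str:
    """Format a number as a signed percentage with 1 decimal."""
    return f"{value:+.1f}%"


formatted_total = format_currency(1234567.891)
formatted_small = format_currency(22.368)
formatted_increase = format_percentage(12.34)
formatted_decrease = format_percentage(-5.0)
```

The same helpers could be applied to a dataframe column with `Series.map` before converting it into a table.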

2. Monthly sales:

Remember, for the monthly sales, we want a table. One way of making a table is using tuples, as shown in the first script. Another way is to first create a dataframe of what we want as a table, and then build a general function that converts any dataframe into a docx table. I’ll go with the latter.

2.1. Get monthly sales df:

In [17]:

def get_monthly_sales(year_df: pd.DataFrame, date_col: str = 'Order Date') -> pd.DataFrame:
    """
    This function returns a dataframe with the monthly sales rounded to 2 decimal places.
    """
    # group the rows by month using date_col and sum the 'Sales' column
    # ('Sales' is selected before summing so that the non-numeric columns are not summed), then round to 2 decimal places
    monthly_sales = year_df.groupby(pd.Grouper(key=date_col, freq='M'))[['Sales']].sum().round(2)
    
    # change the monthly dates into their corresponding month name 
    monthly_sales.index = monthly_sales.index.month_name()
    
    # reset index: this is important because ms word tables do not take df indices into account
    monthly_sales = monthly_sales.reset_index()
    
    # some aesthetics: rename the date column (whatever date_col is) to 'Month'
    monthly_sales = monthly_sales.rename(columns={date_col: 'Month'})

    return monthly_sales

In [18]:

# use function to get the monthly sales
current_monthly_sales_df = get_monthly_sales(current_year_data)
previous_monthly_sales_df = get_monthly_sales(previous_year_data)
# add an extra column in 'current_monthly_sales_df' to show the percentage difference (rounded to 1dp) when compared to last year
current_monthly_sales_df['Percentage difference'] = ((current_monthly_sales_df['Sales'] - previous_monthly_sales_df['Sales']) / 
previous_monthly_sales_df['Sales'] * 100).round(1)
# add '%' sign after the percentages
# add '$' sign before the monthly sales values
current_monthly_sales_df['Percentage difference'] = current_monthly_sales_df['Percentage difference'].astype(str) + '%'
current_monthly_sales_df['Sales'] = '$' + current_monthly_sales_df['Sales'].astype(str)
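One thing to keep in mind: the percentage difference above is computed row by row, so it assumes both years have the same months in the same order. If a month had no orders in one of the years, the rows would no longer line up. A possible way around this is to merge on the month name first. The sketch below uses made-up numbers, and the column name 'Sales (previous)' is an assumption of mine.

```python
import pandas as pd

# made-up monthly sales: the previous year has no sales in February
current = pd.DataFrame({'Month': ['January', 'February', 'March'],
                        'Sales': [110.0, 90.0, 50.0]})
previous = pd.DataFrame({'Month': ['January', 'March'],
                         'Sales': [100.0, 40.0]})

# a left merge keeps every month of the current year and matches on the month name
merged = current.merge(previous, on='Month', how='left', suffixes=('', ' (previous)'))

# months without a match get NaN instead of being compared with the wrong month
merged['Percentage difference'] = (
    (merged['Sales'] - merged['Sales (previous)']) / merged['Sales (previous)'] * 100
).round(1)
```

Here February ends up with a missing percentage, which can then be displayed as something like 'n/a' in the table.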

2.2. Function to convert a df to a MS Word table:

In [19]:

def make_rows_bold(row):
    """
    This function sets all the cells in the 'row' to bold. 
    To be used to set the header row to bold.
    """
    for cell in row.cells:
        for paragraph in cell.paragraphs:
            for run in paragraph.runs:
                run.font.bold = True

In [20]:

def add_table_to_word_doc(input_df: pd.DataFrame, document):
    """
    This function converts the input_df into a table, adds it to the input document, and then returns the document.
    """
    # this creates a table of the size of the df (plus one row for the column titles)
    table = document.add_table(input_df.shape[0]+1, input_df.shape[1])
    
    # convert the column names of the dataframe to column titles of the ms word table
    # (str() is used because cell text must be a string)
    for col in range(input_df.shape[-1]):
        table.cell(0, col).text = str(input_df.columns[col])

    # add the rest of the dataframe to the table
    for row in range(input_df.shape[0]):
        for col in range(input_df.shape[-1]):
            table.cell(row+1, col).text = str(input_df.values[row, col])
    
    # style of the table ('Table Grid' is just the normal table)
    table.style = 'Table Grid'
    
    # set the title row to bold
    make_rows_bold(table.rows[0])

    return document

Now that we have got everything sorted, let’s add the monthly sales table to the document:

In [21]:

# add paragraph for monthly sales
p_monthly_sales = report_doc.add_paragraph(f'The table below shows the monthly sales for the year {current_year} and the percentage '
                                           'change compared to last year.')

# to add the table to the document, simply run the function
report_doc = add_table_to_word_doc(current_monthly_sales_df, report_doc)

# leave a space after table
p_break = report_doc.add_paragraph()

3. and 4. Now that sales is done, we need to add the pie charts for `Segment` and `Ship Mode`. We need functions to aggregate them and to make the pie charts.

Function to get the aggregate:

In [22]:

def get_aggregate(input_df: pd.DataFrame, col_name: str):
    """
    This function groups the different values in the 'col_name' column and returns the counts for each of them.
    """
    return input_df[col_name].value_counts(dropna=False)

In [23]:

# getting the aggregates
current_year_segment_agg = get_aggregate(current_year_data, 'Segment')
current_year_ship_mode_agg = get_aggregate(current_year_data, 'Ship Mode')

Function to make and save the pie chart locally:

In [24]:

def make_pie_chart_and_save(input_series: pd.Series, picture_filename: str, legend_title: str, my_colours: list = None):
    # note: there are 3 ways to describe the colours (name, hex code or RGB tuple)
    # but I'll just use the defaults in this case
    # you can pass a list of colours as the fourth parameter to customise it, e.g.
    # my_colours = ['orange', '#6F95EB', (121/264, 216/264, 21/264)]

    # start a new figure so that consecutive charts are not drawn on top of each other
    plt.figure()
    # using the index as the labels 
    labels = input_series.index
    # making the pie chart
    pie = plt.pie(input_series.values, labels=labels, autopct='%1.1f%%', startangle=90, colors=my_colours)
    # customising the legend + moving it so that it does not block the pie chart
    plt.legend(pie[0], labels, title=legend_title + ':', bbox_to_anchor=(-0.3,1.0), loc='upper left', fontsize=10)
    # saving the pie chart locally
    plt.savefig(picture_filename + '.png')
    # close the figure once it has been saved
    plt.close()

In [25]:

# saving segment pie chart
make_pie_chart_and_save(current_year_segment_agg, 'segment_pie_chart', 'Segment')

# saving ship mode pie chart
make_pie_chart_and_save(current_year_ship_mode_agg, 'ship_mode_pie_chart', 'Ship Mode')

Now we can add the saved pie charts to the document as pictures. To place them side by side, we add both pictures to the same run of a single paragraph.

In [26]:

# add paragraph for the pie charts
p_pie_charts = report_doc.add_paragraph('The pie charts below show the proportions of the ')
p_pie_charts.add_run('Segments ').bold = True
p_pie_charts.add_run('and the ')
p_pie_charts.add_run('Shipping Modes ').bold = True
p_pie_charts.add_run(f'used for the year {current_year}:')
# add pie charts
pie_charts_run = report_doc.add_paragraph().add_run()
pie_charts_run.add_picture("segment_pie_chart.png", width=Inches(2.8))
pie_charts_run.add_picture("ship_mode_pie_chart.png", width=Inches(2.8))

Out[26]:

<docx.shape.InlineShape at 0x7f244c3b5d30>

5. Category/Types of products sold:

5.1. First we aggregate them using the previously built function.

In [27]:

# getting the aggregates
current_year_products_agg = get_aggregate(current_year_data, 'Category')

5.2. Then we make a function to make and save a bar chart.

In [28]:

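def plot_bar_chart_and_save(input_series: pd.Series, picture_filename: str, my_colour: str = '#F8A331'):
    # my_colour can be changed when calling the function

    # draw on a fresh figure so that the bars do not end up on a previous chart
    fig, ax = plt.subplots()
    input_series.plot(ax=ax, rot=5, kind='bar', color=my_colour, title='Count of Category/Types of products sold')
    # label each bar with its count
    ax.bar_label(ax.containers[0])
    # save the bar chart locally, then close the figure
    fig.savefig(picture_filename + '.png')
    plt.close(fig)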

In [29]:

# saving category bar chart
plot_bar_chart_and_save(current_year_products_agg, 'products_bar_chart')

Once we have everything ready, we can add the saved bar chart to the document as a picture.

In [30]:

# add a subtitle
report_doc.add_heading('Types of products:', level=1)
# add paragraph for the bar chart
p_products = report_doc.add_paragraph('The bar chart below shows the counts of the different types of products sold for '
                                      f'the year {current_year}.')
# add bar chart
report_doc.add_picture('products_bar_chart.png', width=Inches(5.5))

Out[30]:

<docx.shape.InlineShape at 0x7f244c1cd670>

We are nearly done. Let’s add a brief conclusion to wrap this up.

In [31]:

if percentage_change_total_sales > 0:
    conclusion_text = f'an increase in total sales for the year {current_year}. Well done to the team and keep up the good work!'
else:
    conclusion_text = f'a loss for the year {current_year}. Please keep on pushing to meet the targets!'

p_conclusion = report_doc.add_paragraph('To conclude, it can be noted that the company made ')
p_conclusion.add_run(conclusion_text)

Out[31]:

<docx.text.run.Run at 0x7f244c1cd2b0>

Okay I think we are done now! Oh wait, we need to save it!

In [32]:

# save document locally
report_doc.save('yearly_report_2017.docx')

Full script below (as a general function):

In [33]:

def generate_yearly_report(current_year: int, current_year_data: pd.DataFrame, previous_year_data: pd.DataFrame):
   
    # creating the document
    report_doc = Document()
    # adding a heading of level=0
    report_doc.add_heading(f'Yearly report {current_year}', level=0)

    # add a short intro paragraph
    p_intro = report_doc.add_paragraph(f'This is the yearly report for the year {current_year}. We will analyse different '
                                       'metrics for sales, customer attributes, shipping modes and types of products.')

    # use function to get total sales values
    current_total_sales = get_yearly_sales(current_year_data)
    previous_total_sales = get_yearly_sales(previous_year_data)
    # calculate the percentage change in the total sales
    percentage_change_total_sales = ((current_total_sales - previous_total_sales) / previous_total_sales * 100).round(1)

    # let's spice it up a bit
    if percentage_change_total_sales > 0:
        percentage_change_text = f"there has been an increase of {percentage_change_total_sales}% in total sales."
    elif percentage_change_total_sales < 0:
        percentage_change_text = f"there has been a decrease of {abs(percentage_change_total_sales)}% in total sales."
    else:
        percentage_change_text = "there has been no change in the total sales."

    # add a subheading
    report_doc.add_heading('Sales:', level=1)

    # add paragraph for the total sales
    p_yearly_sales = report_doc.add_paragraph(f'The total sales for the year {current_year} is ')
    p_yearly_sales.add_run(f"$ {current_total_sales}").bold = True
    p_yearly_sales.add_run(f'. Compared to the year {current_year-1}, ')           
    p_yearly_sales.add_run(percentage_change_text)  # this part of the text will depend on the conditions set above

    # use function to get the monthly sales
    current_monthly_sales_df = get_monthly_sales(current_year_data)
    previous_monthly_sales_df = get_monthly_sales(previous_year_data)
    # add an extra column in 'current_monthly_sales_df' to show the percentage difference (rounded to 1dp) when compared to
    # last year
    current_monthly_sales_df['Percentage difference'] = ((current_monthly_sales_df['Sales'] - previous_monthly_sales_df['Sales']) / 
    previous_monthly_sales_df['Sales'] * 100).round(1)
    # add '%' sign after the percentages
    # add '$' sign before the monthly sales values
    current_monthly_sales_df['Percentage difference'] = current_monthly_sales_df['Percentage difference'].astype(str) + '%'
    current_monthly_sales_df['Sales'] = '$' + current_monthly_sales_df['Sales'].astype(str)

    # add paragraph for monthly sales
    p_monthly_sales = report_doc.add_paragraph(f'The table below shows the monthly sales for the year {current_year} and the '
                                               'percentage change compared to last year.')

    # to add the table to the document, simply run the function
    report_doc = add_table_to_word_doc(current_monthly_sales_df, report_doc)

    # leave a space after table
    p_break = report_doc.add_paragraph()

    # getting the aggregates
    current_year_segment_agg = get_aggregate(current_year_data, 'Segment')
    current_year_ship_mode_agg = get_aggregate(current_year_data, 'Ship Mode')

    # saving segment pie chart
    make_pie_chart_and_save(current_year_segment_agg, 'segment_pie_chart', 'Segment')

    # saving ship mode pie chart
    make_pie_chart_and_save(current_year_ship_mode_agg, 'ship_mode_pie_chart', 'Ship Mode')

    # add paragraph for the pie charts
    p_pie_charts = report_doc.add_paragraph('The pie charts below show the proportions of the ')
    p_pie_charts.add_run('Segments ').bold = True
    p_pie_charts.add_run('and the ')
    p_pie_charts.add_run('Shipping Modes ').bold = True
    p_pie_charts.add_run(f'used for the year {current_year}:')
    # add pie charts
    pie_charts_run = report_doc.add_paragraph().add_run()
    pie_charts_run.add_picture("segment_pie_chart.png", width=Inches(2.8))
    pie_charts_run.add_picture("ship_mode_pie_chart.png", width=Inches(2.8))

    # getting the aggregates
    current_year_products_agg = get_aggregate(current_year_data, 'Category')

    # saving category bar chart
    plot_bar_chart_and_save(current_year_products_agg, 'products_bar_chart')

    # add a subtitle
    report_doc.add_heading('Types of products:', level=1)
    # add paragraph for the bar chart
    p_products = report_doc.add_paragraph('The bar chart below shows the counts of the different types of products sold for '
                                          f'the year {current_year}.')
    # add bar chart
    report_doc.add_picture('products_bar_chart.png', width=Inches(5.5))

    if percentage_change_total_sales > 0:
        conclusion_text = f'an increase in total sales for the year {current_year}. Well done to the team and keep up the good work!'
    else:
        conclusion_text = f'a loss for the year {current_year}. Please keep on pushing to meet the targets!'

    p_conclusion = report_doc.add_paragraph('To conclude, it can be noted that the company made ')
    p_conclusion.add_run(conclusion_text)

    # save document locally
    report_doc.save('Yearly_Report_' + str(current_year) + '.docx')

In [34]:

generate_yearly_report(2017, data_2017, data_2016)

Conclusion:

Thank you all for taking the time to read my blog and try it out. I hope you found it useful. Please note that the scenario I used in this blog is quite a basic one. It can be made more complex to produce more informative reports.