Openpyxl: Master Excel with Python

Q: What is openpyxl?

openpyxl is a Python library for reading, writing, and manipulating Excel files in the .xlsx and .xlsm formats. It allows automation of common Excel tasks from Python without needing Excel installed.

Q: Can openpyxl edit existing workbooks?

Yes. You can load an existing workbook with load_workbook, modify cells, add sheets, update formulas, and save the changes back to disk.
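
As a rough illustration of that load, modify, and save cycle, the sketch below first creates a small file so it can run on its own. The file name report.xlsx and the sheet names are assumptions made for the example, not part of any real project.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Build a small workbook first so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "report.xlsx")  # hypothetical file name
wb = Workbook()
ws = wb.active
ws.title = "Data"
ws.append(["Month", "Revenue"])
ws.append(["Jan", 1000])
wb.save(path)

# Reopen the file, change a cell, add a formula and a sheet, then save it back.
wb = load_workbook(path)
ws = wb["Data"]
ws["B2"] = 1250                        # overwrite an existing value
ws["C2"] = "=B2*2"                     # formulas are written as plain strings
summary = wb.create_sheet("Summary")   # new sheet is appended after "Data"
summary["A1"] = "Generated by script"
wb.save(path)

reloaded = load_workbook(path)
print(reloaded.sheetnames)
```

Saving to the same path overwrites the original file, so writing to a new name is the safer choice for anything important.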

Q: Does openpyxl support charts?

openpyxl can create basic charts using the openpyxl.chart module and attach them to worksheets, though advanced chart customizations are best done in Excel or BI tools.

Q: Is openpyxl cross-platform?

Yes. openpyxl runs on Windows, macOS, and Linux as a pure-Python library, provided Python and dependencies are installed.

Q: What are common pitfalls?

Common issues include incorrect file paths, calling methods that a read-only or write-only workbook does not support, and loading very large files fully into memory. Use read_only or write_only modes for big datasets.

Q: Do formulas update automatically?

Formulas written by openpyxl are recalculated when the workbook is opened in Excel. openpyxl itself never evaluates formulas, so other applications that open the file may need to trigger a recalculation.

Openpyxl practical guide for Python: read, write, and format Excel workbooks with code samples. Covers installation, core APIs, and real-world automation workflows.

XLS Library Team

March 16, 2026·5 min read


Quick Answer

Openpyxl is a Python library that enables reading, writing, and formatting Excel workbooks (.xlsx and .xlsm) without Excel installed. It’s ideal for data pipelines, automation tasks, and reproducible analyses. This guide from XLS Library shows installation, core concepts, and practical workflows to automate Excel tasks with Python using openpyxl.

Why openpyxl fits Python data pipelines

According to XLS Library, openpyxl is a mature, well-documented library for manipulating Excel workbooks from Python. It provides a clean, object-oriented API to create sheets, populate cells, apply styles, and export finished files. This fits data pipelines where Excel is the input or output format, or where you need reproducible reports generated by code. While it is not a full Excel replacement, openpyxl streamlines repetitive tasks, reduces manual errors, and enables automation from data extraction to presentation-ready workbooks.

Python

from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.title = "Data"
ws['A1'] = "Sample"
ws['B1'] = 42
wb.save("sample.xlsx")

What this code does: creates a workbook, sets a sheet title, writes simple values, and saves the file. This is the entry point for more advanced data workflows using openpyxl.

Getting started: install and basic workbook creation

To begin, install openpyxl via pip and verify the installation. The following commands work across platforms (Windows, macOS, Linux).

Bash

pip install openpyxl

Python

# Quick check to confirm installation and version
import openpyxl
print(openpyxl.__version__)

Why this matters: a successful installation ensures you can access the API without relying on Excel being installed. This paves the way for scripting, testing, and automating Excel-related tasks in Python projects.

When you’re ready to create a workbook from scratch, you’ll typically start with Workbook() and active sheet access, as shown above.

Creating a workbook and writing data

Openpyxl makes it straightforward to populate rows and columns. The following example writes a small table and saves it as data.xlsx.

Python

from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.title = "Sales"

headers = ["Month", "Revenue"]
ws.append(headers)
for i, m in enumerate(["Jan","Feb","Mar"], start=1):
    ws.append([m, i * 1000])

wb.save("data.xlsx")

What happens here: we create a sheet named Sales, append a header row, then loop to fill data. Using append is convenient for building rows from lists. If you prefer cell-by-cell control, you can assign values with ws.cell(row=r, column=c, value=...).

One common pattern is building data structures (lists of dictionaries) and writing them with a small helper function to map keys to columns.
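
A minimal sketch of such a helper might look like the following. The function name write_records and its parameters are hypothetical, chosen only for this example; missing keys are written as empty cells.

```python
from openpyxl import Workbook


def write_records(ws, records, columns):
    """Write a header row, then one row per dict, in the given column order.

    Hypothetical helper for illustration; not part of openpyxl.
    """
    ws.append(list(columns))
    for record in records:
        # Keys missing from a record become empty cells (None).
        ws.append([record.get(key) for key in columns])


records = [
    {"Month": "Jan", "Revenue": 1000},
    {"Month": "Feb", "Revenue": 2000},
    {"Month": "Mar"},  # no Revenue key
]

wb = Workbook()
ws = wb.active
write_records(ws, records, ["Month", "Revenue"])

rows = list(ws.iter_rows(values_only=True))
print(rows)
```

Passing the column order explicitly keeps the output stable even if the dictionaries were built with keys in a different order.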

Reading data from Excel using openpyxl

Reading data is often as important as writing it. The following snippet loads an existing workbook in read-only mode and prints the first two columns, skipping the header row. It iterates over rows and unpacks each one into plain values.

Python

from openpyxl import load_workbook

wb = load_workbook("data.xlsx", read_only=True, data_only=True)
ws = wb.active

for row in ws.iter_rows(min_row=2, max_col=2, values_only=True):
    month, revenue = row
    print(month, revenue)

Notes:

read_only=True reduces memory usage for large files.
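data_only=True returns the values Excel last calculated and cached for formula cells; a file that has never been opened and saved in Excel has no cached values, so those cells read as None.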
iter_rows with values_only=True yields tuples of cell values for easy processing.

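If you need to preserve formulas, omit data_only: cell.value then returns the formula text (for example "=SUM(B2:B4)") instead of the cached result. Avoid saving a workbook that was loaded with data_only=True, because the formulas are replaced by their cached values.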
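
The difference between the two loading modes can be seen in a small round trip. This is a sketch that uses an in-memory buffer instead of a file on disk; because the workbook is created by openpyxl and never opened in Excel, no cached result exists for the formula.

```python
import io

from openpyxl import Workbook, load_workbook

# Create a workbook with one formula and save it to an in-memory buffer.
wb = Workbook()
ws = wb.active
ws.append([10, 20])
ws["C1"] = "=SUM(A1:B1)"
buffer = io.BytesIO()
wb.save(buffer)

# Default load: the formula text is preserved.
buffer.seek(0)
formula_view = load_workbook(buffer).active["C1"].value

# data_only=True: only the cached result is returned, and none exists yet.
buffer.seek(0)
cached_view = load_workbook(buffer, data_only=True).active["C1"].value

print(formula_view)
print(cached_view)
```

Once the file has been opened and saved in Excel, the second load would return the calculated number instead of None.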

Iterating rows and columns efficiently

For larger datasets, explicit cell access can be slower. A recommended approach is to use iter_rows with values_only=True to stream data, or use ws.values to iterate across all rows. This snippet demonstrates streaming a simple transformation:

Python

from openpyxl import load_workbook

wb = load_workbook("data.xlsx", read_only=True, data_only=True)
ws = wb.active

# Transform: create a new list of dicts with column headers as keys
headers = next(ws.iter_rows(min_row=1, max_row=1, values_only=True))  # values_only yields plain values, not cells
data = []
for row in ws.iter_rows(min_row=2, values_only=True):
    data.append(dict(zip(headers, row)))
print(data[:3])

This pattern helps when converting Excel data into Python-native structures for analysis or further processing. If memory permits, you can also write transformed data directly to a new workbook or a CSV file for downstream pipelines.
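
For the CSV route, one possible sketch streams rows from a read-only workbook into Python's csv module. The file names and the temporary directory are assumptions for the example, and the sample workbook is created first so the snippet runs on its own.

```python
import csv
import os
import tempfile

from openpyxl import Workbook, load_workbook

workdir = tempfile.mkdtemp()
xlsx_path = os.path.join(workdir, "data.xlsx")
csv_path = os.path.join(workdir, "data.csv")  # hypothetical output name

# Sample input so the sketch runs on its own.
wb = Workbook()
ws = wb.active
for row in (["Month", "Revenue"], ["Jan", 1000], ["Feb", 2000]):
    ws.append(row)
wb.save(xlsx_path)

# Stream rows from the workbook straight into a CSV file.
wb = load_workbook(xlsx_path, read_only=True, data_only=True)
ws = wb.active
with open(csv_path, "w", newline="", encoding="utf-8") as handle:
    writer = csv.writer(handle)
    for row in ws.iter_rows(values_only=True):
        writer.writerow(row)
wb.close()  # read-only workbooks keep the file open until closed

# Read the CSV back to check what was written.
with open(csv_path, newline="", encoding="utf-8") as handle:
    csv_rows = list(csv.reader(handle))
print(csv_rows)
```

Note that CSV stores everything as text, so numbers come back as strings when the file is read again.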

Formatting cells and applying styles

Openpyxl supports styling to improve readability of generated reports. You can set fonts, colors, alignment, and fills. The following example demonstrates a bold header row with a blue fill:

Python

from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill

wb = Workbook()
ws = wb.active
ws.append(["Month", "Revenue"])
ws.append(["Jan", 12000])

# Style header row (first row)
header = ws[1]
for cell in header:
    cell.font = Font(bold=True, color="FFFFFF")
    cell.fill = PatternFill(start_color="4F81BD", end_color="4F81BD", fill_type="solid")

wb.save("styled.xlsx")

Why styling matters: styled workbooks are easier to read and share with non-technical stakeholders. A common approach is to separate data, headers, and totals using color, borders, and bold fonts. You can also apply number formatting (e.g., decimals, currency) via openpyxl.styles.numbers or NamedStyle for consistency across sheets.
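
As a small sketch of number formatting and a reusable named style: the format strings below are standard Excel format codes, while the style name report_header is an assumption made for this example.

```python
from openpyxl import Workbook
from openpyxl.styles import Font, NamedStyle

wb = Workbook()
ws = wb.active
ws.append(["Month", "Revenue", "Growth"])
ws.append(["Jan", 12000.5, 0.125])

# Number formats change how a value is displayed, not the stored value.
ws["B2"].number_format = "#,##0.00"   # thousands separator, two decimals
ws["C2"].number_format = "0.0%"       # percentage with one decimal

# A named style is registered once on the workbook and reused by name.
header_style = NamedStyle(name="report_header")  # hypothetical style name
header_style.font = Font(bold=True)
wb.add_named_style(header_style)
for cell in ws[1]:
    cell.style = "report_header"

print(ws["B2"].number_format, ws["A1"].style)
```

Registering the style on the workbook means every sheet in that file can apply it by name, which helps keep headers consistent.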

Formulas, charts, and images: extending Excel with openpyxl

Openpyxl can populate formulas and, to some extent, embed charts into workbooks. The following example writes a simple sum formula and adds a small bar chart to visualize values:

Python

from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

wb = Workbook()
ws = wb.active
ws.append(["Month", "Sales"])
ws.append(["Jan", 1000])
ws.append(["Feb", 1500])
ws.append(["Mar", 1200])

# Write a formula in C2 that sums B2:B4
ws['C2'] = "=SUM(B2:B4)"

# Create and attach a simple bar chart
chart = BarChart()
data = Reference(ws, min_col=2, min_row=1, max_row=4)
chart.add_data(data, titles_from_data=True)
ws.add_chart(chart, "E2")

wb.save("charts.xlsx")

Caveats: openpyxl can generate charts but rendering options are limited compared to Excel’s UI. Formulas are stored as strings and evaluated by Excel when the file is opened. For complex charts, consider exporting data and using Excel’s native chart tools or BI tools for visualization.
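
Within those limits, a chart can still be given a title, axis titles, and category labels. The sketch below extends the same sales data with month labels on the x-axis; the title strings are placeholders, and the workbook is saved to an in-memory buffer rather than to disk.

```python
import io
import zipfile

from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

wb = Workbook()
ws = wb.active
for row in (["Month", "Sales"], ["Jan", 1000], ["Feb", 1500], ["Mar", 1200]):
    ws.append(row)

chart = BarChart()
chart.title = "Monthly sales"   # placeholder titles for the example
chart.x_axis.title = "Month"
chart.y_axis.title = "Sales"

# Values include the header cell so it can be used as the series title.
values = Reference(ws, min_col=2, min_row=1, max_row=4)
# Categories are the month names, without the header cell.
months = Reference(ws, min_col=1, min_row=2, max_row=4)
chart.add_data(values, titles_from_data=True)
chart.set_categories(months)
ws.add_chart(chart, "E2")

buffer = io.BytesIO()
wb.save(buffer)

# The saved package should now contain a chart part.
buffer.seek(0)
part_names = zipfile.ZipFile(buffer).namelist()
```

How the chart finally looks still depends on the application that renders it, so it is worth opening the file once to check.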

Working with large files and memory considerations

When dealing with big datasets, avoid loading entire files into memory. Use write-only mode to append data efficiently, or process data in chunks. The example below demonstrates a streaming write pattern with 10,000 rows; the same loop scales to much larger row counts, up to Excel's limit of 1,048,576 rows per sheet.

Python

from openpyxl import Workbook

wb = Workbook(write_only=True)
ws = wb.create_sheet()

# Simulate streaming data
for i in range(10_000):
    ws.append([i, i * 2 if i % 2 == 0 else i / 2])

wb.save("large.xlsx")

If you must read very large files, prefer read_only=True and iterate with iter_rows as shown earlier to keep memory usage in check. For stable automation, consider batching writes to separate output files and validating integrity after each batch.
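
A minimal sketch of that read-only pattern, aggregating values row by row without building a list of all rows. The file name, sheet name, and column layout are assumptions for the example; the sample file is generated first so the snippet is self-contained.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "numbers.xlsx")  # hypothetical file name

# Generate a sample file (kept small here; real inputs would be much larger).
wb = Workbook()
ws = wb.active
ws.title = "Numbers"
ws.append(["id", "value"])
for i in range(5_000):
    ws.append([i, i * 2])
wb.save(path)

# Read it back one row at a time and keep only running totals in memory.
wb = load_workbook(path, read_only=True)
ws = wb["Numbers"]
row_count = 0
total = 0
for _id, value in ws.iter_rows(min_row=2, values_only=True):
    row_count += 1
    total += value
wb.close()  # releases the file handle held by the read-only workbook

print(row_count, total)
```

Only the counters persist between iterations, so memory use stays roughly flat as the row count grows.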

Debugging and common pitfalls

Common issues with openpyxl involve mixing read/write modes, misaligned indices, and mismatched data types. A cautious approach is to validate input data before writing and to handle exceptions gracefully. The snippet below demonstrates basic error handling when loading a workbook:

Python

from openpyxl import load_workbook

try:
    wb = load_workbook("data.xlsx")
except FileNotFoundError:
    print("The file was not found. Create it first or provide a valid path.")
except Exception as e:
    print("An unexpected error occurred:", e)

Best practice: keep a small tests directory with sample workbooks to validate scripts during development. This reduces surprises when run in CI or automated pipelines.
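
One way such a check could look is a round-trip test that writes a workbook to a temporary directory and reads it back. The function build_report stands in for whatever script is under test; both names are hypothetical.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def build_report(path, rows):
    """Hypothetical function under test: writes a header plus data rows."""
    wb = Workbook()
    ws = wb.active
    ws.title = "Sales"
    ws.append(["Month", "Revenue"])
    for row in rows:
        ws.append(list(row))
    wb.save(path)


def test_build_report_round_trip():
    # A temporary directory keeps test output away from real files.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.xlsx")
        build_report(path, [("Jan", 1000), ("Feb", 2000)])
        wb = load_workbook(path)
        rows = list(wb["Sales"].iter_rows(values_only=True))
        wb.close()
    assert rows == [("Month", "Revenue"), ("Jan", 1000), ("Feb", 2000)]
    return rows


result = test_build_report_round_trip()
print(result)
```

A test runner such as pytest would pick up the test_ function automatically; calling it directly, as here, works for a quick manual check.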

Steps

Estimated time: 30-60 minutes

1. Install and verify environment
Install the openpyxl package with pip and confirm the installation by printing the version. This ensures the Python environment is prepared for workbook operations.
Tip: Use a virtual environment to isolate dependencies.

2. Create a workbook and write data
Initialize a workbook, create a data sheet, and populate a few rows. Saving the file produces a tangible Excel workbook to inspect.
Tip: Prefer `append` for simple row-wise additions.

3. Read data back for validation
Load the workbook in read-only mode when possible and iterate rows to verify content matches expectations.
Tip: Use `values_only=True` to simplify value access.

4. Apply formatting and formulas
Add a header style and include simple formulas. This demonstrates how to enrich data in code.
Tip: Formulas are stored as strings and evaluated by Excel.

5. Persist and test
Save the final workbook and re-open to verify styles, data types, and formulas render as intended.
Tip: Automate a quick re-check to catch regressions.

Warning: Avoid overwriting important files; write to a new filename or backup originals.

Pro Tip: Use write_only for large datasets to minimize memory usage.

Note: When sharing generated workbooks, ensure formulas reference correct cell ranges.

Prerequisites

Required

Python 3.8+
pip package manager
openpyxl library
Basic command line knowledge
Access to Excel or Excel-compatible files

Optional

Code editor or IDE

Commands

Action	Command	Notes
Install openpyxl	`pip install openpyxl`	Upgrade with pip install --upgrade openpyxl
Run a Python script	`python script.py`	Ensure you're in the script's directory
Check installed version	`python -c "import openpyxl; print(openpyxl.__version__)"`	
Create a quick workbook inline	`python -c "from openpyxl import Workbook; Workbook().save('sample.xlsx')"`	Inline one-liner for quick experiments
Upgrade openpyxl	`pip install --upgrade openpyxl`	

The Essentials

Install openpyxl and verify version
Create workbooks and write data with append
Read data efficiently with iter_rows
Apply styling to improve readability
Use formulas and charts for insights
