
Excel xlsx file not supported in xlrd

Problem

When attempting to read Excel files using pandas.read_excel() with the xlrd library, you may encounter the error:

xlrd.biffh.XLRDError: Excel xlsx file; not supported

This error commonly occurs when:

  • Reading .xlsx or .xlsm (macro-enabled) Excel files
  • Using xlrd 2.0.0 or later
  • Deploying to environments such as Pivotal Cloud Foundry (PCF), where a fresh install can pull in the latest xlrd release

The root cause is that xlrd 2.0.0+ no longer supports .xlsx files, only the legacy .xls format.
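
To make the version cutoff concrete, here is a minimal sketch of a check that encodes only the 2.0.0 boundary described above. The function name `xlrd_reads_xlsx` is hypothetical and not part of xlrd or pandas.

```python
def xlrd_reads_xlsx(xlrd_version: str) -> bool:
    """Return True if the given xlrd version string is older than 2.0.0.

    Hypothetical helper: it only encodes the 2.0.0 cutoff, nothing else.
    """
    major = int(xlrd_version.split(".")[0])
    return major < 2


# Hard-coded version strings for illustration. In a real environment the
# string could come from importlib.metadata.version("xlrd").
for version in ("1.2.0", "2.0.1"):
    print(version, "reads .xlsx:", xlrd_reads_xlsx(version))
```

If the check returns False for your installed version, passing engine='xlrd' for an .xlsx or .xlsm file will raise the error shown above.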

Solutions

The optimal solution is to use the openpyxl engine, which properly supports both .xlsx and .xlsm files:

python
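import os

import pandas as pd

# APP_PATH is not defined in the original snippet; this placeholder assumes
# the application's base directory is the current working directory.
APP_PATH = os.getcwd()

df1 = pd.read_excel(
    os.path.join(APP_PATH, "Data", "aug_latest.xlsm"),
    engine='openpyxl'  # needed for .xlsx/.xlsm; xlrd 2.0.0+ reads only .xls
)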

Prerequisites

Make sure you have the required packages installed:

sh
pip install pandas openpyxl

For optimal compatibility, ensure you're using:

  • pandas >= 1.2.0 (preferably the latest version)
  • openpyxl >= 3.0.0

Alternative Engines

If openpyxl doesn't meet your needs, consider these alternatives:

sh
# For Excel Binary (.xlsb) files (used via engine='pyxlsb')
pip install pyxlsb

sh
# For automating an installed copy of Excel (not a pandas.read_excel engine)
pip install xlwings
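
The install command alone does not show how pyxlsb is used from pandas, so here is a minimal sketch. The helper name `read_xlsb` and the file name in the usage comment are hypothetical. It assumes pandas and pyxlsb are both installed.

```python
from pathlib import Path


def read_xlsb(path, sheet_name=0):
    """Read one sheet of an .xlsb workbook into a DataFrame (hypothetical helper)."""
    path = Path(path)
    if path.suffix.lower() != ".xlsb":
        raise ValueError(f"Expected an .xlsb file, got: {path.name}")
    # Imported here so the extension check above runs even when pandas
    # is not installed.
    import pandas as pd

    return pd.read_excel(path, engine="pyxlsb", sheet_name=sheet_name)


# Usage, with a hypothetical file name:
# df = read_xlsb("report.xlsb")
```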

Security Warning

Downgrading to xlrd 1.2.0 restores .xlsx reading, but it is not recommended due to potential security vulnerabilities. Only consider this if absolutely necessary and after a proper risk assessment.

sh
pip install xlrd==1.2.0

Best Practices

  1. Always specify the engine parameter when reading Excel files:
python
import pandas as pd

# For .xlsx files
df = pd.read_excel('file.xlsx', engine='openpyxl')

# For .xls files
df = pd.read_excel('file.xls', engine='xlrd')

  2. Check file extensions and use the matching engine:
python
import os

import pandas as pd

filename = 'data.xlsm'
extension = os.path.splitext(filename)[1].lower()

if extension in ['.xlsx', '.xlsm']:
    engine = 'openpyxl'
elif extension == '.xls':
    engine = 'xlrd'
elif extension == '.xlsb':
    engine = 'pyxlsb'
else:
    raise ValueError(f"Unsupported file format: {extension}")

df = pd.read_excel(filename, engine=engine)
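
File extensions can be wrong, for example when an .xlsx file has been renamed to .xls. As a hedged complement to the extension check, the sketch below looks at the first bytes of the file instead. It relies on two well-known signatures: .xlsx, .xlsm and .xlsb are ZIP containers, while legacy .xls is an OLE2 compound file. The function names are hypothetical, and this is a heuristic, not a full format validation.

```python
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx, .xlsm, .xlsb
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # legacy .xls

# Container type each extension is expected to use.
EXPECTED_CONTAINER = {
    ".xls": "ole2",
    ".xlsx": "zip",
    ".xlsm": "zip",
    ".xlsb": "zip",
}


def sniff_container(header: bytes) -> str:
    """Classify a file by its leading bytes: 'zip', 'ole2' or 'unknown'."""
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


def extension_matches_content(extension: str, header: bytes) -> bool:
    """Return True if the extension agrees with the container signature."""
    return EXPECTED_CONTAINER.get(extension.lower()) == sniff_container(header)


# Usage: read the first 8 bytes of the file and compare.
# with open(filename, "rb") as fh:
#     ok = extension_matches_content(extension, fh.read(8))
```

A mismatch suggests choosing the engine by content rather than by name. The signature cannot tell .xlsx apart from .xlsb, since both are ZIP containers.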

Why This Happened

The xlrd library dropped support for .xlsx files in version 2.0.0, citing security concerns, and now maintains only legacy .xls support. This change is documented in:

  • The xlrd release notes
  • Library documentation with prominent warnings
  • PyPI project description

Production Deployment

For cloud environments like PCF, ensure your requirements.txt includes:

pandas>=1.2.0
openpyxl>=3.0.0

Avoid pinning to older, insecure versions of xlrd, as this may introduce security risks in your production applications.
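
A deployment can also verify its dependencies at startup instead of failing on the first Excel read. The following is a minimal sketch using the standard library's importlib.metadata. The names `MINIMUMS`, `version_tuple` and `find_problems` are hypothetical, and the version parsing is deliberately simple: it keeps only the leading digits of the first three components.

```python
import re
from importlib import metadata

# Minimum versions taken from the requirements.txt above.
MINIMUMS = {"pandas": (1, 2, 0), "openpyxl": (3, 0, 0)}


def version_tuple(version: str) -> tuple:
    """Turn '1.2.0' or '2.0.0rc1' into a three-part tuple of integers."""
    numbers = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad '3.0' to (3, 0, 0)
        numbers.append(0)
    return tuple(numbers)


def find_problems(minimums=MINIMUMS) -> list:
    """Return a message for each package that is missing or too old."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if version_tuple(installed) < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems.append(f"{name} {installed} is older than {wanted}")
    return problems


if __name__ == "__main__":
    for problem in find_problems():
        print("Dependency problem:", problem)
```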

Summary

File Type   Recommended Engine   Alternative Engine
.xlsx       openpyxl             -
.xlsm       openpyxl             -
.xls        xlrd                 -
.xlsb       pyxlsb               -

Always use the appropriate engine for your Excel file format and avoid deprecated xlrd versions to ensure both functionality and security in your applications.