Fixing ModuleNotFoundError: No module named 'pandas.core.indexes.numeric'

Problem Overview

When working with pandas DataFrames stored as artifacts in Metaflow (or any pickled pandas object), you may encounter this error after upgrading to pandas 2.0.0 or newer:

python

ModuleNotFoundError: No module named 'pandas.core.indexes.numeric'

The key characteristics of this issue:

Occurs when accessing DataFrame properties like df.index (not during initial unpickling)
Persists even after pandas upgrades (pip install pandas -U)
Primarily affects files pickled with pandas 1.x and loaded in 2.x environments
Caused by the removal of the internal pandas.core.indexes.numeric module in pandas 2.0, which pickles written under 1.x still reference by name
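The underlying mechanism is that a pickle stores the import path of each class it contains, not the class itself. The following is a minimal sketch using only the standard library, with `Fraction` standing in for a pandas index class; `referenced_globals` is a hypothetical helper name, and the exact references inside a real pandas pickle are not shown here.

```python
import pickle
import pickletools
from fractions import Fraction


def referenced_globals(payload: bytes) -> list[str]:
    """Return the 'module name' pairs a protocol-2 pickle imports on load."""
    return [
        arg
        for opcode, arg, _pos in pickletools.genops(payload)
        if opcode.name == "GLOBAL"
    ]


# Protocol 2 writes class references as plain-text GLOBAL opcodes,
# which makes them easy to list.
payload = pickle.dumps(Fraction(1, 2), protocol=2)
refs = referenced_globals(payload)
print(refs)
```

A DataFrame pickled under pandas 1.x would presumably carry a comparable reference into pandas.core.indexes.numeric, which a plain unpickler then fails to import under 2.x.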

Recommended Solutions

Preferred Method: Use `pandas.read_pickle()`

Load files using pandas' built-in deserialization method for version compatibility:

python

import pandas as pd

# Load pickled DataFrame (works across pandas versions)
file_path = "artifacts/file.pkl"  # Replace with your actual path
df = pd.read_pickle(file_path)

Why this works:

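Remaps references to internal classes that moved or were removed between pandas versions
Documented as backward compatible to pandas 0.20.3 for objects written with to_pickle (check the documentation for your installed version, since the guaranteed floor may differ between releases)
Resolves the pandas.core.indexes.numeric reference that a plain unpickler cannot import
Can be used on a Metaflow artifact if you have access to its underlying pickle file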
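The remapping idea can be illustrated with the standard library alone. This sketch is not pandas' actual implementation; it only shows the general technique of overriding `pickle.Unpickler.find_class`. The module name `legacy_numeric` and the classes `Index` and `RemappingUnpickler` are hypothetical stand-ins.

```python
import io
import pickle
import sys
import types


class Index:
    """Stand-in for the class that still exists after the refactor."""

    def __init__(self, values):
        self.values = values


# Simulate the old layout: a module 'legacy_numeric' that exposes Int64Index.
legacy_module = types.ModuleType("legacy_numeric")
LegacyInt64Index = type("Int64Index", (Index,), {"__module__": "legacy_numeric"})
legacy_module.Int64Index = LegacyInt64Index
sys.modules["legacy_numeric"] = legacy_module
payload = pickle.dumps(LegacyInt64Index([1, 2, 3]))

# Simulate the upgrade: the module no longer exists.
del sys.modules["legacy_numeric"]

# A plain load now fails because the stored import path cannot be resolved.
try:
    pickle.loads(payload)
    plain_load_error = None
except ModuleNotFoundError as exc:
    plain_load_error = exc

# Map (old module, old name) to the class that replaces it.
REPLACEMENTS = {("legacy_numeric", "Int64Index"): Index}


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that redirects references to removed classes."""

    def find_class(self, module, name):
        replacement = REPLACEMENTS.get((module, name))
        if replacement is not None:
            return replacement
        return super().find_class(module, name)


restored = RemappingUnpickler(io.BytesIO(payload)).load()
print(type(restored).__name__, restored.values)
```

The plain `pickle.loads` call fails with the same class of error described above, while the remapping unpickler returns a usable object of the replacement class.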

Fallback Solution: Compatibility Shims

For situations where pd.read_pickle() fails:

python

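from pandas.compat import pickle_compat

# pickle_compat is a private pandas module and may change between releases.
# load() expects an open binary file handle, not a path.
with open('file.pkl', 'rb') as fh:
    df = pickle_compat.load(fh)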

Version Locking Approach (If Solutions Fail)

If you need to maintain legacy systems:

bash

pip install "pandas<2.0.0"   # Downgrade to latest 1.x version

Important Considerations

Metaflow-Specific Workflow:

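In Metaflow flows, the client returns the deserialized object when you access an artifact, so there is no path attribute to read from it:

python

from metaflow import Flow

flow = Flow('YourFlowName')
run = flow.latest_successful_run
# Replace 'dataframe' with your artifact name; this returns the object itself, not a file path
df = run.data.dataframe

To use pd.read_pickle() you need the artifact's underlying pickle file from your datastore. Otherwise, run the client code in an environment with the pandas version that wrote the artifact.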

WARNING

Loading with pickle.load() or joblib.load() instead of pd.read_pickle() may trigger this error, because those loaders do not remap the internal pandas module paths that were removed in v2.x.
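One way to keep direct `pickle.load()` calls out of a codebase is to route all loading through a small wrapper around `pd.read_pickle()`. The sketch below assumes pandas is installed; `load_dataframe` is a hypothetical helper name, and the round trip at the end only demonstrates the happy path within a single environment.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_dataframe(path):
    """Load a pickled DataFrame via pd.read_pickle, adding version context on failure."""
    try:
        obj = pd.read_pickle(path)
    except ModuleNotFoundError as exc:
        raise RuntimeError(
            f"Could not unpickle {path} with pandas {pd.__version__}: {exc}. "
            "The file may have been written by a different major pandas version."
        ) from exc
    if not isinstance(obj, pd.DataFrame):
        raise TypeError(f"Expected a DataFrame, got {type(obj).__name__}")
    return obj


# Write and reload a small frame in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "frame.pkl"
    pd.DataFrame({"a": [1, 2, 3]}).to_pickle(path)
    restored = load_dataframe(path)

print(restored)
```

As with any pickle-based loader, this should only be pointed at files from a trusted source, since unpickling can execute arbitrary code.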

Prevention Strategies

Pin pandas versions in dependencies:

```
# requirements.txt
pandas>=1.5.3,<2.0.0
```

Migrate serialization formats:

python

# Use modern formats instead of pickle
# (to_parquet needs pyarrow or fastparquet; to_feather needs pyarrow)
df.to_parquet("data.parquet")      # Better version compatibility
df.to_feather("data.feather")      # Faster I/O

Best Practice

Always use the same major pandas version for both pickling and unpickling operations. Major releases (1.x → 2.x) often break compatibility.
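To enforce this rule rather than rely on memory, one option is to record the writer's pandas version next to each pickle and compare it at load time. This is a minimal sketch under that assumption; the `.meta.json` sidecar convention and the function names are hypothetical, and the demo uses literal version strings where real code would pass `pd.__version__`.

```python
import json
import tempfile
from pathlib import Path


def major(version: str) -> int:
    """Return the major component of a version string such as '1.5.3'."""
    return int(version.split(".")[0])


def write_version_sidecar(pickle_path: Path, writer_version: str) -> Path:
    """Record the writer's library version next to the pickle file."""
    sidecar = pickle_path.with_name(pickle_path.name + ".meta.json")
    sidecar.write_text(json.dumps({"pandas_version": writer_version}))
    return sidecar


def check_version_sidecar(pickle_path: Path, reader_version: str) -> str:
    """Raise if the recorded major version differs from the reader's."""
    sidecar = pickle_path.with_name(pickle_path.name + ".meta.json")
    recorded = json.loads(sidecar.read_text())["pandas_version"]
    if major(recorded) != major(reader_version):
        raise RuntimeError(
            f"Pickle written with pandas {recorded}, "
            f"but this environment has pandas {reader_version}"
        )
    return recorded


# Demo with literal version strings; in practice pass pd.__version__.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "file.pkl"
    write_version_sidecar(target, "1.5.3")
    same_major_ok = check_version_sidecar(target, "1.4.0")
    try:
        check_version_sidecar(target, "2.0.0")
        mismatch_detected = False
    except RuntimeError:
        mismatch_detected = True
```

Failing early with an explicit message is easier to diagnose than a ModuleNotFoundError raised from deep inside the unpickler.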
