DataFrame Append AttributeError in pandas
Problem Statement
When working with pandas DataFrames, you might try to add a new row using the `.append()` method and encounter the error:

```
AttributeError: 'DataFrame' object has no attribute 'append'
```

This error typically occurs because:
- You're using pandas 2.0 or newer, where `.append()` was removed
- Your code previously worked in older pandas versions (≤1.5.x)
- You're trying to append data row-by-row in a loop
Example of error-causing code:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})
new_row = {'A': 3, 'B': 'z'}

# This will raise AttributeError in pandas ≥2.0
df = df.append(new_row, ignore_index=True)
```
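To see the version dependence concretely, here is a minimal reproduction; whether the `except` branch fires depends on the installed pandas version:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})

# Raises AttributeError on pandas >= 2.0; on <= 1.5.x the call still succeeds
try:
    df = df.append({'A': 3, 'B': 'z'}, ignore_index=True)
except AttributeError as err:
    print(err)
```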
Why append() Was Removed
The `.append()` method was deprecated in pandas 1.4 and removed entirely in pandas 2.0. This change occurred because:
- Performance issues: `.append()` created a new DataFrame each time it was called, leading to O(n²) complexity when used in a loop
- Misleading analogy: it resembled Python's `list.append()` but worked differently (it created new objects instead of modifying in place)
- Better alternatives exist: more efficient methods for DataFrame expansion are available
IMPORTANT
Do not use the private `_append()` method, as suggested in some answers. It is an internal pandas method that is not intended for public use and may change without warning.
Recommended Solutions
For Single Row Addition: Use concat()
```python
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
```

Explanation:
- Wrap the new row in a list (`[new_row]`) to create a single-row DataFrame
- Use `pd.concat()` to combine the two DataFrames
- `ignore_index=True` resets the index automatically
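A complete, runnable version of this pattern, using the same toy `df` and `new_row` as the earlier example:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})
new_row = {'A': 3, 'B': 'z'}

# Wrap the dict in a list so it becomes a one-row DataFrame, then concatenate
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
print(df)
#    A  B
# 0  1  x
# 1  2  y
# 2  3  z
```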
For Single Row Addition (Alternative): Use loc[] with RangeIndex
```python
# Only works when the index is 0-based contiguous integers
df.loc[len(df)] = new_row
```

LIMITATIONS
This approach only works if:
- Your index is a default `RangeIndex` (0, 1, 2, ...)
- You have no duplicate indices
- You're adding a single row
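The first limitation matters in practice: if the index is not contiguous, `df.loc[len(df)]` can silently overwrite an existing row rather than append one. A small sketch of the pitfall:

```python
import pandas as pd

# Index [0, 2] is NOT a contiguous RangeIndex
df = pd.DataFrame({'A': [1, 2]}, index=[0, 2])

# len(df) == 2, and label 2 already exists, so this OVERWRITES that row
# instead of adding a new one
df.loc[len(df)] = [99]
print(len(df))  # still 2
```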
Best Practice for Adding Multiple Rows
When adding rows in a loop, never append inside each iteration. Instead:

```python
new_rows = []  # Collect data here
for item in data_source:
    # Process the item and create a row dictionary
    new_row = process_item(item)
    new_rows.append(new_row)

# Convert to a DataFrame once and concatenate
new_df = pd.DataFrame(new_rows)
result = pd.concat([df, new_df], ignore_index=True)
```
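Filled in with a hypothetical `process_item` and toy data (both stand-ins, not part of any real API), the pattern runs end to end:

```python
import pandas as pd

df = pd.DataFrame({'A': [1], 'B': ['x']})
data_source = [10, 20, 30]

# Hypothetical stand-in for the process_item() used in the pattern above
def process_item(item):
    return {'A': item, 'B': 'y'}

new_rows = []
for item in data_source:
    new_rows.append(process_item(item))

# One DataFrame construction, one concat -- no quadratic copying
result = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)
```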
Performance comparison (relative time to append 10,000 rows):

| Method | Relative time |
|---|---|
| Build list first → concat once | 1× (baseline) |
| Append via `.loc` in a loop | ~1600× slower |
| `concat` in a loop | ~800× slower |
Why Avoid Append in Loops?
Performance degrades quadratically, because each append operation:
- Copies all existing data
- Allocates new memory
- Creates a new DataFrame object
- Leaves the old object for garbage collection
The benchmark below, using the `perfplot` library, illustrates this:

```python
import pandas as pd
import perfplot

def concat_loop(lst):
    df = pd.DataFrame(columns=['A', 'B'])
    for dic in lst:
        df = pd.concat([df, pd.DataFrame([dic])], ignore_index=True)
    return df

def concat_once(lst):
    return pd.DataFrame(lst)

def loc_loop(lst):
    df = pd.DataFrame(columns=['A', 'B'])
    for dic in lst:
        df.loc[len(df)] = dic
    return df

perfplot.plot(
    setup=lambda n: [{'A': i, 'B': 'a' * (i % 5 + 1)} for i in range(n)],
    kernels=[concat_loop, concat_once, loc_loop],
    labels=['concat in loop', 'concat once', 'loc in loop'],
    n_range=[2**k for k in range(16)],
    xlabel='Rows added',
    title='Adding rows to DataFrame',
    relative_to=1,
    equality_check=None,  # outputs differ in dtype; skip the default equality check
)
```
Migration Example
Old approach (deprecated):

```python
results = pd.DataFrame()
for file in files:
    data = parse_file(file)
    results = results.append(data, ignore_index=True)
```

Updated solution:

```python
all_data = []
for file in files:
    data = parse_file(file)
    all_data.append(data)

results = pd.concat([pd.DataFrame(d) for d in all_data],
                    ignore_index=True)
```
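With a hypothetical `parse_file` (a stand-in that returns one row dict per file; a real one would read actual files), the updated solution is runnable as-is:

```python
import pandas as pd

# Hypothetical stand-in for parse_file(): returns a list of row dicts
def parse_file(file):
    return [{'file': file, 'n_chars': len(file)}]

files = ['a.csv', 'bb.csv', 'ccc.csv']

all_data = []
for file in files:
    all_data.append(parse_file(file))

results = pd.concat([pd.DataFrame(d) for d in all_data],
                    ignore_index=True)
```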
REAL-WORLD SCENARIO
When web scraping, processing data, or reading multiple files:
- Collect raw data in Python lists/dictionaries
- Convert to a DataFrame once, after collection completes
- Use vectorized operations whenever possible
Summary of Best Practices
- Single row: `pd.concat([original_df, pd.DataFrame([new_row])], ignore_index=True)`
- Batch operations: accumulate data in built-in Python containers, then convert to a DataFrame once
- Reading external data:
  - Collect file contents in a list, then create the DataFrame after the loop
  - Read each file with `pd.read_csv()` and combine the results with a single `pd.concat()` call rather than appending one by one
- Avoid row-wise operations: prefer vectorized DataFrame operations when possible
The removal of `.append()` encourages better performance patterns. By using proper concatenation techniques, you'll write faster, more memory-efficient pandas code.