Resolving Corrupted CSV Downloads with cURL and Ampersands in URLs
Problem
When downloading CSV files using cURL from URLs containing ampersands (&), the resulting file displays unreadable characters and fails to open in applications like Excel. This typically occurs when URLs include query parameters separated by & symbols, as shown in the command:
curl --output /home/../test2.csv https://cloudstor.aarnet.edu.au/plus/s/2DhnLGDdEECo4ys/download?path=%2FUNSW-NB15%20-%20CSV%20Files&files=UNSW-NB15_1.csv
The core issue is shell interpretation of unquoted characters. In shells like Bash and Zsh, an unquoted & ends the command and runs it in the background, so curl receives only the part of the URL before the &. The remainder (files=UNSW-NB15_1.csv) is parsed as a separate command. The server never sees the files parameter, so it responds with something other than the target CSV file.
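To see the split without touching the network, the following minimal sketch replaces curl with a hypothetical helper, print_arg, that echoes the first argument it receives. The example.com URL is a placeholder as well. It assumes Bash or a POSIX sh (Zsh may instead reject the unquoted ? as a failed glob).

```shell
# Hypothetical stand-in for curl: prints the first argument exactly as received.
print_arg() { printf '%s\n' "$1"; }

# Unquoted: the shell cuts the command at '&', runs print_arg in the background,
# and parses 'files=report.csv' as a separate variable assignment.
unquoted=$(print_arg https://example.com/download?path=%2Fdata&files=report.csv)

# Quoted: the whole URL arrives as one argument.
quoted=$(print_arg "https://example.com/download?path=%2Fdata&files=report.csv")

printf 'unquoted: %s\n' "$unquoted"   # unquoted: https://example.com/download?path=%2Fdata
printf 'quoted:   %s\n' "$quoted"     # quoted:   https://example.com/download?path=%2Fdata&files=report.csv
```

The first line of output shows what curl would have been handed in the failing command: a URL with no files parameter.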
Primary Solution: URL Quoting
Enclose the entire URL in double quotes (") to prevent the shell from interpreting special characters:
curl --output output.csv "https://cloudstor.aarnet.edu.au/plus/s/2DhnLGDdEECo4ys/download?path=%2FUNSW-NB15%20-%20CSV%20Files&files=UNSW-NB15_1.csv"
Why This Works
- Shells pass quoted content to the command as a single argument instead of parsing it
- Preserves special characters like &, ?, and = that are common in download URLs
- Prevents truncation at &files=..., ensuring the full URL reaches the server
- Keeps URL-encoded sequences (%2F, %20) intact
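Apply This to All URLs
Always quote URLs containing:
- Spaces ( ), which should also be percent-encoded as %20
- Special symbols (&, ?, =, $, %); inside double quotes the shell still expands $, so use single quotes for URLs that contain it
- Characters outside the 7-bit ASCII range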
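One caveat worth illustrating: double quotes protect & and ?, but the shell still expands $ inside them, whereas single quotes keep every character literal. The sketch below uses a hypothetical print_arg helper in place of curl, a placeholder example.com URL, and an illustrative variable name, token, that is deliberately left unset.

```shell
# Hypothetical stand-in for curl: prints the first argument exactly as received.
print_arg() { printf '%s\n' "$1"; }

unset token   # illustrative variable name; unset so the expansion is visible

# Double quotes: '$token' is expanded (to an empty string here) before curl would see it.
double=$(print_arg "https://example.com/get?id=1&key=$token")

# Single quotes: nothing is expanded; the URL is passed through unchanged.
single=$(print_arg 'https://example.com/get?id=1&key=$token')

printf 'double: %s\n' "$double"   # double: https://example.com/get?id=1&key=
printf 'single: %s\n' "$single"   # single: https://example.com/get?id=1&key=$token
```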
Enhanced Solutions
Handling Redirects
Many file services use redirects. Add -L to follow HTTP redirects:
curl -L --output filename.csv "https://complete/url?with&parameters"
Filename Best Practices
- Verify the final filename after redirects:
curl -L -O -J "https://example.com/download?params"
-O: Saves under the remote filename taken from the URL
-J: With -O, uses the filename from the server's Content-Disposition header when one is sent
- Set an explicit filename and extension with --output to avoid conflicts
Session Maintenance
For cookie-based authentication, preserve sessions with:
curl -L -c cookies.txt -b cookies.txt -o result.csv "https://site.com/download"
Common Pitfalls
DANGER
Corruption symptoms usually indicate an incomplete or wrong download:
- Unexpected HTML tags in the file
- file output.csv reports an HTML document (a valid CSV is normally reported as CSV or plain text)
- Size differences compared with browser downloads
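The first symptom can be checked from the command line. The sketch below is one possible heuristic, not a complete validator: a hypothetical function, looks_like_html, scans the first 512 bytes of a file for an HTML marker. The sample files are created only for the demonstration.

```shell
# Hypothetical helper: succeeds if the start of the file contains an HTML marker.
looks_like_html() {
    # Only the first 512 bytes are read; enough to catch a leading doctype or <html> tag.
    head -c 512 "$1" | grep -qiE '<!doctype html|<html'
}

# Demonstration with two throwaway files (contents are made up for illustration).
demo_dir=$(mktemp -d)
printf 'id,name\n1,alpha\n' > "$demo_dir/good.csv"
printf '<!DOCTYPE html>\n<html><body>Error</body></html>\n' > "$demo_dir/bad.csv"

for f in "$demo_dir/good.csv" "$demo_dir/bad.csv"; do
    if looks_like_html "$f"; then
        printf '%s: looks like an HTML page, not CSV\n' "${f##*/}"
    else
        printf '%s: no HTML marker found\n' "${f##*/}"
    fi
done

rm -rf "$demo_dir"
```

A file that passes this check can still be wrong (for example an archive), so it complements rather than replaces the other checks.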
Debugging Steps:
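- Test without --output to view the server response:
curl "https://url" | head
- Check headers with verbose mode (curl writes verbose output to stderr, so redirect stderr rather than stdout):
curl -v -o /dev/null "https://url" 2> debug.log
- Compare with the server response seen in the browser (developer tools, Network tab)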
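For a quicker summary than a full verbose log, curl's --write-out option can print selected response details. The sketch below wraps it in a hypothetical function, probe_url; the function name and output layout are illustrative choices, and the body is discarded so only the summary is shown.

```shell
# Hypothetical helper: prints status, content type, redirect count and final URL.
probe_url() {
    # -sS: hide the progress meter but keep error messages
    # -L:  follow redirects
    # -o /dev/null: discard the body; only the --write-out summary is printed
    curl -sS -L -o /dev/null \
        -w 'status=%{http_code} type=%{content_type} redirects=%{num_redirects} final=%{url_effective}\n' \
        "$1"
}

# Usage (requires network access; the URL is a placeholder):
#   probe_url "https://example.com/download?path=%2Fdata&files=report.csv"
```

A content type such as text/html where a CSV was expected is a strong hint that the request did not reach the intended file.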
Confirming Successful Fix
Valid downloads will have:
- No HTML/XML content visible when opened in a plain text editor
- Correct file extensions (.csv for CSV files)
- Consistent hash values across download attempts
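The hash comparison can be sketched as follows. This example uses the POSIX cksum utility for portability; a stronger hash tool such as sha256sum can be substituted where it is installed. The function name same_content and the sample files are hypothetical and stand in for two real download attempts.

```shell
# Hypothetical helper: succeeds if both files have the same checksum and size.
same_content() {
    # Reading from stdin keeps the filename out of cksum's output.
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

# Demonstration with throwaway files standing in for two download attempts.
demo_dir=$(mktemp -d)
printf 'id,name\n1,alpha\n' > "$demo_dir/first.csv"
printf 'id,name\n1,alpha\n' > "$demo_dir/second.csv"

if same_content "$demo_dir/first.csv" "$demo_dir/second.csv"; then
    echo "downloads match"
else
    echo "downloads differ"
fi

rm -rf "$demo_dir"
```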
Best Practices Summary
- Always quote URLs
- Follow redirects with -L
- Verify content-type headers match your expected format
- Use server headers for filenames with -OJ
- Maintain sessions for authenticated endpoints
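Several of these practices can be combined in a small wrapper. The sketch below is one possible arrangement rather than a required one: fetch_csv is a hypothetical function name, and it adds curl's --fail option, which is not used elsewhere in this guide, so that an HTTP error status produces a non-zero exit code instead of an error page saved under a .csv name.

```shell
# Hypothetical helper: fetch_csv URL OUTPUT_FILE
fetch_csv() {
    # --fail:       exit non-zero on HTTP errors instead of saving the error page
    # --location:   follow redirects (same as -L)
    # --silent --show-error: no progress meter, but keep error messages
    curl --fail --location --silent --show-error --output "$2" "$1"
}

# Usage (requires network access; note the quoted URL):
#   fetch_csv "https://example.com/download?path=%2Fdata&files=report.csv" report.csv \
#       || echo "download failed" >&2
```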
Example production-grade command:
curl -L \
--output "UNSW-NB15_1.csv" \
--progress-bar \
"https://cloudstor.aarnet.edu.au/plus/s/2DhnLGDdEECo4ys/download?path=%2FUNSW-NB15%20-%20CSV%20Files&files=UNSW-NB15_1.csv"