
Resolving Corrupted CSV Downloads with cURL and Ampersands in URLs

Problem

When downloading CSV files using cURL from URLs containing ampersands (&), the resulting file displays unreadable characters and fails to open in applications like Excel. This typically occurs when URLs include query parameters separated by & symbols, as shown in the command:

bash
curl --output /home/../test2.csv https://cloudstor.aarnet.edu.au/plus/s/2DhnLGDdEECo4ys/download?path=%2FUNSW-NB15%20-%20CSV%20Files&files=UNSW-NB15_1.csv

The cause is shell parsing, not cURL. In Bash and Zsh an unquoted & is a control operator: it ends the command and runs it in the background. cURL therefore receives only the part of the URL before the first &, and the shell treats the remainder (files=UNSW-NB15_1.csv) as a separate command, here a variable assignment. The server gets a request without the files parameter, so the saved file is not the target CSV.
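The split can be observed without touching the network. The sketch below uses a hypothetical helper, show_args, in place of cURL and a placeholder example.com URL (both are assumptions for illustration) to print exactly what a command receives:

```bash
# show_args is a hypothetical stand-in for curl: it prints each argument it receives.
show_args() { printf '[%s]\n' "$@"; }

# Unquoted: the shell ends the command at '&' and runs it in the background,
# then executes files=a.csv as a separate command (a variable assignment).
show_args https://example.com/download?path=%2Fdata&files=a.csv
wait  # let the background job finish before continuing

echo "shell variable files is now: $files"

# Quoted: the whole URL arrives as one argument.
show_args "https://example.com/download?path=%2Fdata&files=a.csv"
```

The first call prints only the part of the URL before the &, which is all cURL would have seen; the second prints the complete URL.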

Primary Solution: URL Quoting

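Enclose the entire URL in quotes so the shell passes it to cURL as a single argument. Double quotes (") protect &, ?, and =. If the URL also contains $ or backticks, use single quotes (') instead, because those are still expanded inside double quotes: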

bash
curl --output output.csv "https://cloudstor.aarnet.edu.au/plus/s/2DhnLGDdEECo4ys/download?path=%2FUNSW-NB15%20-%20CSV%20Files&files=UNSW-NB15_1.csv"

Why This Works

  1. Makes the shell pass the quoted URL to cURL as one literal argument
  2. Preserves special characters such as &, ?, and =, which are common in download URLs
  3. Prevents truncation at &files=..., so the full URL reaches the server
  4. Keeps URL-encoded sequences (%2F, %20) intact

Quote All URLs

Always quote URLs containing:

  • Spaces
  • Special symbols (&, ?, =, $, %)
  • Non-ASCII characters (outside the 7-bit ASCII range)
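The choice of quote character matters for the $ case listed above. A small sketch, using a made-up example.com URL and a hypothetical shell variable named token (neither comes from the original command), shows the difference:

```bash
token="EXPANDED"   # hypothetical variable that happens to exist in the shell

# Double quotes protect & and ?, but $token is still expanded.
double="https://example.com/get?id=1&key=$token"

# Single quotes keep every character literal, including $.
single='https://example.com/get?id=1&key=$token'

printf '%s\n' "$double" "$single"
```

When a URL contains a literal $, single quotes are the safer default; double quotes remain useful when part of the URL is meant to come from a variable.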

Enhanced Solutions

Handling Redirects

Many file services answer with a redirect to the actual file. Add -L so cURL follows the redirect instead of saving the redirect response:

bash
curl -L --output filename.csv "https://complete/url?with&parameters"

Filename Best Practices

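  1. Let the server name the file once redirects have been followed:
bash
curl -L -O -J "https://example.com/download?params"
  • -O: Saves the file under its remote name instead of writing it to stdout
  • -J: Used with -O, takes the filename from the server's Content-Disposition header when one is sent
  2. Set an explicit filename, including its extension, with --output to avoid conflicts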

Session Maintenance

For cookie-based authentication, store the cookies the server sets (-c) and send them back on later requests (-b):

bash
curl -L -c cookies.txt -b cookies.txt -o result.csv "https://site.com/download"

Common Pitfalls

DANGER

These symptoms indicate that cURL saved something other than the intended file:

  • Unexpected HTML tags in the file
  • file output.csv reports an HTML document instead of CSV or plain text
  • Size differs from the same file downloaded in a browser
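The HTML check can be scripted. The following is a minimal sketch, not a complete validator: looks_like_html is a hypothetical helper name, and the 512-byte window is an arbitrary assumption that is usually enough to catch an HTML error page.

```bash
# looks_like_html FILE
# Succeeds (exit status 0) if the start of FILE contains an HTML doctype or <html tag.
looks_like_html() {
  # Only the first 512 bytes are inspected; an HTML page announces itself early.
  head -c 512 "$1" | grep -qiE '<!doctype html|<html'
}
```

A possible use after a download is `looks_like_html output.csv && echo "got an HTML page, not a CSV"`.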

Debugging Steps:

  1. Test without --output to view the start of the server response:
bash
curl "https://url" | head
  2. Capture the headers with verbose mode (cURL writes them to stderr, so redirect stderr and discard the body):
bash
curl -v -o /dev/null "https://url" 2> debug.log
  3. Compare with the response the browser receives (developer tools, Network tab)

Confirming Successful Fix

Valid downloads will have:

  • No HTML/XML content visible when opened in a plain text editor
  • Correct file extensions (.csv for CSV files)
  • Consistent hash values across download attempts
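The hash comparison can be sketched as a small function. It assumes GNU coreutils' sha256sum is installed (on macOS, shasum -a 256 is the usual substitute); same_hash is a hypothetical helper name.

```bash
# same_hash FILE1 FILE2
# Exit status 0 if both files have the same SHA-256 digest, 1 if they differ,
# 2 if either file cannot be read.
same_hash() {
  # Reading via stdin keeps the filename out of sha256sum's output,
  # so the two result strings can be compared directly.
  a=$(sha256sum < "$1") || return 2
  b=$(sha256sum < "$2") || return 2
  [ "$a" = "$b" ]
}
```

For example, download the file twice under different names and run `same_hash first.csv second.csv && echo "downloads match"`.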

Best Practices Summary

  1. Always quote URLs
  2. Follow redirects with -L
  3. Verify content-type headers match your expected format
  4. Use server headers for filenames with -OJ
  5. Maintain sessions for authenticated endpoints
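For the content-type check, one option is to save the response headers with cURL's -D (--dump-header) option and read the last Content-Type line, since a redirected request produces one header block per hop. The helper below is a sketch under that assumption; final_content_type and headers.txt are hypothetical names.

```bash
# final_content_type HEADER_FILE
# Prints the value of the last Content-Type header found in a file
# written by: curl -L -D headers.txt -o output.csv "URL"
final_content_type() {
  grep -i '^content-type:' "$1" |  # header names are case-insensitive
    tail -n 1 |                    # the last block belongs to the final response
    cut -d: -f2- |                 # drop the header name
    tr -d '\r' |                   # HTTP header lines end in CRLF
    sed 's/^[[:space:]]*//'        # trim the space after the colon
}
```

A value such as text/html where a CSV was expected suggests that the server returned an error or login page.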

Example command applying the points above:

bash
curl -L \
  --output "UNSW-NB15_1.csv" \
  --progress-bar \
  "https://cloudstor.aarnet.edu.au/plus/s/2DhnLGDdEECo4ys/download?path=%2FUNSW-NB15%20-%20CSV%20Files&files=UNSW-NB15_1.csv"