You’ve been copying files with cp for years, and if you’re moving a 50GB backup or syncing a directory tree to a remote server, that habit is quietly costing you time, visibility, and recoverability every single day.

The cp command does exactly one thing well: it copies files. It gives you no progress indicator, no rate limiting, no resume support, and no built-in checksum verification.

On a local copy of a few megabytes that’s fine, but the moment you’re pushing a 40GB database dump across a network link or copying 200,000 small files to a new disk, you want more than a blinking cursor and a silent prayer.

Why cp Falls Short on Large Copies

cp is part of the POSIX standard, so it’s always there, but it was built for simplicity, not for bulk data operations. It reads a file and writes it sequentially, with no parallelism, no delta logic, and no feedback to the terminal.

If the process gets interrupted, whether by a power cut, an SSH timeout, or an accidental Ctrl+C, you start over completely, because there’s no resume.

And if you’re copying to a remote host, you’re doing it through a separate tool like scp, which has the same all-or-nothing behavior and adds encryption overhead even when you don’t need it on a trusted LAN.

If you’re regularly moving large datasets between servers and still reaching for cp, share this with your team – the tools below will save someone a 2 am do-over.

rsync: The Go-To Tool for Resumable File Transfers

rsync is the first tool to learn when cp isn’t enough, because it copies only the differences between source and destination, supports resume, and works both locally and over SSH.

Install it if it’s not already present:

sudo apt install rsync         [On Debian, Ubuntu and Mint]
sudo dnf install rsync         [On RHEL/CentOS/Fedora and Rocky/AlmaLinux]
sudo apk add rsync             [On Alpine Linux]
sudo pacman -S rsync           [On Arch Linux]
sudo zypper install rsync      [On OpenSUSE]    
sudo pkg install rsync         [On FreeBSD]

The sudo prefix runs the command with root privileges, which is needed for installing packages. For basic local file copies, you won’t need sudo, but syncing system directories will require it.

A standard local directory copy looks like this:

rsync -av --progress /source/directory/ /destination/directory/

Output:

sending incremental file list
database/
database/dump_2024.sql
    2,147,483,648 100%   98.45MB/s    0:00:20 (xfr#1, to-chk=0/2)

sent 2,147,483,909 bytes  received 35 bytes  102.24MB/s
total size is 2,147,483,648

Breaking down the flags:

  • -a enables archive mode, which preserves permissions, timestamps, symlinks, and recursive directory structure in a single flag.
  • -v prints each file name as it transfers.
  • --progress shows a live per-file transfer rate and percentage.

The trailing slash after /source/directory/ matters: with a trailing slash, rsync copies the contents of the directory. Without it, rsync copies the directory itself as a subdirectory inside the destination. Get that wrong, and you’ll end up with /destination/directory/directory/ instead of what you expected — a common first-time mistake.
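
A quick side-by-side makes the difference easier to remember, using the same placeholder paths as above:

# Trailing slash: the contents of 'directory' land directly inside the destination.
rsync -av /source/directory/ /destination/directory/

# No trailing slash: 'directory' itself is created inside the destination,
# ending up at /destination/directory/directory/.
rsync -av /source/directory /destination/directory/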

To copy to a remote server over SSH, the syntax is nearly identical:

rsync -av --progress /local/path/ user@remote-ip:/remote/path/

Replace remote-ip with your server’s IP address, which you can find with ip a.

ip a

If the transfer drops halfway, run the same command again and rsync skips the files that already transferred successfully. By default a file that was only partially copied is restarted from scratch, so for very large files add --partial, or use -P, which combines --partial with --progress and keeps the partial file so a rerun can build on it.
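
For example, a resumable version of the same remote copy, using the same placeholder paths as above:

rsync -avP /local/path/ user@remote-ip:/remote/path/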

Going deeper on rsync is worth the time – the SSH Course on Pro TecMint covers SSH-based transfers, key auth, and remote rsync patterns across 54 chapters.

pv: Add Progress Bars to File Transfers

pv (Pipe Viewer) is a small utility that sits inside a Unix pipe and shows transfer speed, elapsed time, and estimated completion. It doesn’t replace cp or rsync, but it wraps them.

Install it:

sudo apt install pv         [On Debian, Ubuntu and Mint]
sudo dnf install pv         [On RHEL/CentOS/Fedora and Rocky/AlmaLinux]
sudo apk add pv             [On Alpine Linux]
sudo pacman -S pv           [On Arch Linux]
sudo zypper install pv      [On OpenSUSE]    
sudo pkg install pv         [On FreeBSD]

The simplest use is copying a single large file with a live progress bar:

pv /source/large-file.iso > /destination/large-file.iso

Output:

8.35GiB 0:01:22 [ 104MiB/s] [=========>          ] 63% ETA 0:00:47

That output shows you exactly how fast the data is actually moving, which is something you’d never get from a bare cp. You can also pipe pv into compression for an archive-and-copy in one shot:

pv /source/large-file.tar | gzip > /destination/large-file.tar.gz

Breaking down the pipeline:

  • pv /source/large-file.tar reads the source file and reports throughput to your terminal.
  • gzip compresses the stream in real time.
  • > /destination/large-file.tar.gz writes the compressed output to the destination.
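
pv can also throttle the transfer with its rate-limit option, which is handy when you don’t want a bulk copy to saturate a shared disk or link; the roughly 10 MiB/s cap below is just an illustrative value:

pv -L 10M /source/large-file.iso > /destination/large-file.iso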

dd: The Power Tool for Disk Cloning and Raw Copies

dd is a lower-level tool and it’s already installed on every Linux system. It reads and writes raw blocks, which makes it the right tool for cloning a full disk or partition, creating disk images, and testing raw disk throughput.

The risk with dd is that a typo in the output path can wipe a wrong disk with no warning, so always double-check your target before running it.

A typical disk-to-disk clone looks like this:

sudo dd if=/dev/sda of=/dev/sdb bs=64K conv=noerror,sync status=progress

Output:

50033664512 bytes (50 GB, 47 GiB) copied, 623.847 s, 80.2 MB/s

Breaking down the flags:

  • if=/dev/sda sets the input file, which is the source disk.
  • of=/dev/sdb sets the output file, which is the destination disk; confirm this is the right device with lsblk before running.
  • bs=64K sets the block size to 64 kilobytes, which is significantly faster than the default 512-byte block size for large sequential reads.
  • conv=noerror,sync tells dd to continue past read errors and fill bad blocks with zeros rather than stopping the entire copy.
  • status=progress prints live throughput every few seconds, which was added in coreutils 8.24 – on older systems, you won’t have this flag, and you’ll need to send a USR1 signal manually to get a progress report.
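
On those older systems, you can still get a one-off progress report by sending the running dd process a USR1 signal from a second terminal (this assumes only one dd process is running):

sudo kill -USR1 $(pgrep -x dd)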

Warning: dd does not ask for confirmation. If you swap if and of, you write your source disk to the destination and destroy the data you meant to copy.
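
A quick habit that prevents that mistake: list your block devices first and confirm the source and destination by size and mount point before touching dd.

lsblk -o NAME,SIZE,TYPE,MOUNTPOINT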

If this saved you from a painful dd mistake, pass it along to someone who’s just starting to work with disk images.

parallel + rsync: Faster Copying for Millions of Tiny Files

rsync is fast for large files but single-threaded per transfer. When you have a directory with hundreds of thousands of small files – think a Node.js node_modules directory, a mail spool, or a photo library – rsync can take far longer than expected because the per-file overhead dominates over actual data transfer time.

GNU Parallel solves this by running multiple rsync jobs simultaneously.

sudo apt install parallel         [On Debian, Ubuntu and Mint]
sudo dnf install parallel         [On RHEL/CentOS/Fedora and Rocky/AlmaLinux]
sudo apk add parallel             [On Alpine Linux]
sudo pacman -S parallel           [On Arch Linux]
sudo zypper install parallel      [On OpenSUSE]    
sudo pkg install parallel         [On FreeBSD]

Then run parallel rsync across a large directory tree:

find /source/directory -mindepth 1 -maxdepth 1 -type d | \
  parallel -j 4 rsync -a {} /destination/directory/

Breaking down the pipeline:

  • find /source/directory -mindepth 1 -maxdepth 1 -type d lists the top-level subdirectories of the source.
  • parallel -j 4 runs 4 rsync jobs simultaneously, one per subdirectory, so adjust -j to match your CPU count and disk speed.
  • rsync -a {} /destination/directory/ syncs each subdirectory to the destination, with {} replaced by each directory name.
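
If you’d rather have the job count track the machine’s CPU count automatically, one variation using the coreutils nproc command is:

find /source/directory -mindepth 1 -maxdepth 1 -type d | \
  parallel -j "$(nproc)" rsync -a {} /destination/directory/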

On a directory with 500,000 small files, this approach routinely cuts copy time by 60 to 70 percent compared to a single rsync call, because the I/O queue stays full instead of waiting on one file at a time.

The 100+ Essential Linux Commands course on Pro TecMint covers find, pipes, and command-line composition in detail if you want to get comfortable building pipelines like this one.

Verify File Integrity with SHA256 Checksums

None of these tools matters much if you don’t verify the copy actually succeeded cleanly. For any critical copy, run a checksum comparison after the transfer completes.

SHA256 is the right choice for most purposes:

sha256sum /source/large-file.iso /destination/large-file.iso

Output:

a3b4c1d2e5f6...  /source/large-file.iso
a3b4c1d2e5f6...  /destination/large-file.iso

If both hashes match, the copy is byte-perfect. If they differ, something went wrong during the transfer, such as a disk error, network corruption, or another process writing to the source mid-copy, and you need to copy again before trusting that data.
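
For whole directory trees rather than single files, one approach is to build a checksum manifest on the source and verify it against the destination with sha256sum -c (the paths below are placeholders):

cd /source/directory && find . -type f -exec sha256sum {} + > /tmp/manifest.sha256
cd /destination/directory && sha256sum -c /tmp/manifest.sha256

sha256sum -c prints OK or FAILED per file, so a single corrupted file stands out immediately.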

Conclusion

cp is fine for moving a config file from one directory to another, but real sysadmin work, such as large backups, remote syncs, disk clones, and directories with millions of inodes, calls for better tools.

rsync gives you resume and delta transfer, pv gives you visibility, dd gives you block-level control, and parallel rsync gives you throughput on small-file-heavy directories.

The best thing to try right now: pick a large directory on your system and copy it once with cp, then again with rsync -av --progress, and compare the output and timing. You’ll immediately see what you’ve been missing, and the muscle memory for rsync will start building from there.
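
A minimal way to run that comparison, with a placeholder source path and throwaway destinations:

time cp -r /source/directory /tmp/copy-test-cp
time rsync -av --progress /source/directory/ /tmp/copy-test-rsync/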

What’s your go-to tool for bulk file copies in production? And have you run into a scenario where none of these were enough, and you had to reach for something else? Drop it in the comments.
