Duplicate File Management

Find, analyze, and safely remove duplicate files to reclaim storage space

Difficulty: Intermediate · Estimated time: 30 minutes

Understanding Duplicate Files

Duplicate files are identical copies of the same file stored in multiple locations. They waste storage space and can make file management confusing. FileFortress helps you identify and manage these duplicates across all your cloud storage providers.

This guide covers finding duplicates, understanding detection methods, exporting deletion lists, and safely cleaning up your storage.

Detection Methods

How FileFortress identifies duplicates

1. Name & Size Matching (Fast)

Files with the same name and size are considered potential duplicates. This method is fast but may include false positives (different files that happen to have the same name and size).

Use case: Quick overview of potential duplicates

Caution: Review matches carefully before deleting; name-and-size matching can flag files that are not true duplicates.

2. Hash Verification (Guaranteed)

Files whose content hashes (MD5, SHA-1, etc.) match are byte-for-byte duplicates for all practical purposes. This method is slower than name-and-size matching but eliminates false positives.

Use case: Safe deletion with confidence

Recommended: Always use --hash-verified-only for exports

Finding Duplicates

Basic duplicate detection

Interactive Mode

filefortress find duplicates

This opens an interactive menu where you can explore duplicate groups, view details, and navigate through your duplicates.

Script Mode (Non-Interactive)

filefortress find duplicates --non-interactive --view summary

Displays a summary of duplicates without interactive prompts, making it ideal for automation.

Keep Strategies

Deciding which file to keep

When exporting duplicates for deletion, you must choose which file to keep from each group. FileFortress offers several strategies:

oldest - Keep Earliest Modified Date

Preserves the original file based on modification date. Best for maintaining file history.

newest - Keep Latest Modified Date

Keeps the most recently modified version. Useful if files have been updated.

first - Keep First Found

Simple strategy that keeps the first file in the list. Predictable behavior.

smallest - Keep Smallest File

Useful when files have different compression levels or quality settings.

largest - Keep Largest File

Preserves highest quality version when files differ in compression.

by-remote - Keep from Specific Remote

Prioritizes files from your preferred storage provider. Requires --keep-remote option.

--keep-strategy by-remote --keep-remote "Primary Backup"
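Taken together, the strategies above reduce to picking one file per duplicate group. A rough sketch, using hypothetical file records with `path`, `mtime`, `size`, and `remote` fields (not FileFortress's actual schema):

```python
# Illustrative mapping from each keep strategy to a selection rule.
def choose_keep(files, strategy, keep_remote=None):
    """Pick the one file to keep from a duplicate group."""
    if strategy == "oldest":
        return min(files, key=lambda f: f["mtime"])
    if strategy == "newest":
        return max(files, key=lambda f: f["mtime"])
    if strategy == "first":
        return files[0]
    if strategy == "smallest":
        return min(files, key=lambda f: f["size"])
    if strategy == "largest":
        return max(files, key=lambda f: f["size"])
    if strategy == "by-remote":
        # Prefer a file stored on the requested remote, if any.
        for f in files:
            if f["remote"] == keep_remote:
                return f
        return files[0]
    raise ValueError(f"unknown strategy: {strategy}")
```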

Exporting Deletion Lists

Generate files for safe cleanup

FileFortress can export lists of duplicate files to delete, allowing you to review before taking action and use external tools for deletion.

Paths Format (Simple Text List)

filefortress find duplicates --non-interactive \
  --export-format paths \
  --keep-strategy oldest \
  --hash-verified-only \
  --output-file duplicates-to-delete.txt

Generates a simple text file with one file path per line, plus comments showing which file is being kept.
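The exact layout may vary, but based on that description the exported file could look something like this (hypothetical contents):

```text
# Keep: Photos/2023/beach.jpg
Photos/backup/beach.jpg
Old Photos/beach (copy).jpg
```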

JSON Format (Structured Data)

filefortress find duplicates --non-interactive \
  --export-format json \
  --keep-strategy newest \
  --hash-verified-only \
  --output-file duplicates.json

Creates a structured JSON file with complete information about each duplicate group, including which file to keep and why.
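The exact schema is not documented here; as a hypothetical illustration using the `groups`, `keepFile`, and `deleteFiles` keys that the Python example later in this guide expects, an export might look like:

```json
{
  "groups": [
    {
      "keepFile": { "path": "Documents/report.pdf" },
      "deleteFiles": [
        { "path": "Backup/report.pdf" },
        { "path": "Old/report (1).pdf" }
      ]
    }
  ]
}
```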

Rclone Script (Cloud Storage)

filefortress find duplicates --non-interactive \
  --export-format rclone \
  --keep-strategy oldest \
  --hash-verified-only \
  --include-keep-file \
  --output-file cleanup.sh

Generates a ready-to-run bash script with rclone delete commands for cloud storage cleanup.

PowerShell Script (Local Files)

filefortress find duplicates --non-interactive \
  --export-format powershell \
  --keep-strategy newest \
  --hash-verified-only \
  --output-file cleanup.ps1

Creates a PowerShell script with multiple deletion options (dry-run, with confirmation, or force).

Bash Script (Local Files)

filefortress find duplicates --non-interactive \
  --export-format bash \
  --keep-strategy oldest \
  --hash-verified-only \
  --output-file cleanup.sh

Generates a bash script with rm -i commands for interactive deletion on Linux/macOS systems.

Safety Best Practices

Protect your data

Always use --hash-verified-only

Only export hash-verified duplicates, so every deletion candidate is confirmed by file content rather than just name and size

Review before deleting

Always manually review the exported list before executing deletions

Test with small batches

Start with a small subset to verify your process works correctly

Use --include-keep-file

Add comments showing which file is kept for easier verification

Backup before bulk deletion

Ensure you have backups before deleting large numbers of files

Example Workflows

Complete examples for common scenarios

Workflow 1: Safe Cleanup for Beginners

# Step 1: Find duplicates interactively
filefortress find duplicates

# Step 2: Export safe deletion list
filefortress find duplicates --non-interactive \
  --export-format paths \
  --keep-strategy oldest \
  --hash-verified-only \
  --include-keep-file \
  --output-file safe-to-delete.txt

# Step 3: Review the file manually
notepad safe-to-delete.txt

# Step 4: Delete files (use your preferred method)

Workflow 2: Prioritize Primary Storage

# Keep files from your primary storage, delete from backups
filefortress find duplicates --non-interactive \
  --export-format json \
  --keep-strategy by-remote \
  --keep-remote "Google Drive" \
  --hash-verified-only \
  --output-file cleanup-backups.json

Workflow 3: Preview Without Saving

# See what would be deleted without creating a file
filefortress find duplicates \
  --export-format paths \
  --keep-strategy newest \
  --hash-verified-only \
  --include-keep-file

Integration with External Tools

Using exported lists with other tools

FileFortress exports are designed to work with external deletion tools. Here are some common integrations:

Using with rclone (Cloud Storage)

# Read paths from file and delete with rclone
while IFS= read -r file; do
  [[ -z "$file" || "$file" =~ ^# ]] && continue  # Skip blanks and comments
  rclone delete "remote:$file"
done < safe-to-delete.txt

Using with PowerShell (Local Files)

# Read paths and delete with confirmation
Get-Content safe-to-delete.txt | Where-Object { $_ -notmatch '^#' } | ForEach-Object {
  if (Test-Path $_) {
    Remove-Item $_ -Confirm
  }
}

Using JSON with Custom Scripts

The JSON export format provides structured data perfect for custom automation scripts:

# Python example
import json

with open('duplicates.json') as f:
    data = json.load(f)

for group in data['groups']:
    print(f"Keeping: {group['keepFile']['path']}")
    for file in group['deleteFiles']:
        print(f"  Delete: {file['path']}")

Related Guides

Learn more about FileFortress features: