shuf: Shuffle lines randomly
- Author: Linux Bash
Shuffling Text Lines Efficiently with shuf in Linux
In the world of Linux, efficiency is key. Whether you're a system administrator, a developer, or a data scientist, you often need to manipulate text data quickly and effectively. One handy tool that deserves more attention is shuf, a command-line utility that randomly shuffles the lines of a file or input stream. This is particularly useful for generating random samples, creating randomised lists, or setting up randomised conditions for simulations.
What is shuf?
shuf is a utility in GNU Coreutils, available by default on most Linux distributions. It reads lines from a file (or standard input), randomly permutes them, and writes the result to standard output. It can also generate a random permutation of a range of integers, such as 1 to N, making it a versatile tool for any task requiring random order.
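As a minimal sketch of the two input modes described above, the following shuffles three lines arriving on standard input and then the same three words passed as arguments with -e (a GNU shuf option). The words and the variable names from_stdin and from_args are purely illustrative.

```shell
# Shuffle three lines arriving on standard input.
from_stdin=$(printf 'alpha\nbeta\ngamma\n' | shuf)
printf '%s\n' "$from_stdin"

# Shuffle command-line arguments instead of input lines (-e, GNU shuf).
from_args=$(shuf -e alpha beta gamma)
printf '%s\n' "$from_args"
```

Each run prints the same three words in a random order; only the order changes, never the set of lines.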
Installation of shuf
While shuf is typically installed by default with the core utilities in many Linux distributions, there might be cases where it isn't available or needs to be installed manually. Here's how to make sure it's set up on your system:
On Debian and Ubuntu:
For Debian-based distributions like Ubuntu, you can use apt:
sudo apt update
sudo apt install coreutils
On Fedora:
Fedora and other distributions using dnf can install shuf from the core utilities package:
sudo dnf install coreutils
On openSUSE:
For openSUSE, the zypper package manager is the way to go:
sudo zypper install coreutils
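Whichever package manager applies, a quick check can confirm whether shuf is on the PATH. The sketch below relies only on command -v and shuf --version; the shuf_status variable is an illustrative name, not part of any tool.

```shell
# Look for shuf on the PATH and report the result.
if command -v shuf >/dev/null 2>&1; then
    shuf_status="available"
    # GNU shuf prints its name and version on the first line of --version.
    shuf --version | head -n 1
else
    shuf_status="missing"
    echo "shuf not found; install the coreutils package" >&2
fi
echo "shuf is $shuf_status"
```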
How to Use shuf
Using shuf is straightforward. Here are some practical examples to get you started:
Shuffle the lines of a text file:
shuf filename.txt
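To try this without touching real data, the sketch below builds a throwaway ten-line file, shuffles it into a second file with -o (the GNU shuf output option), and checks that shuffling changed only the order. The file names and the workdir variable are assumptions made for the example.

```shell
workdir=$(mktemp -d)
seq 1 10 > "$workdir/input.txt"          # hypothetical sample file

# Write the shuffled lines to a separate file with -o (--output).
shuf -o "$workdir/shuffled.txt" "$workdir/input.txt"

# Sorting both files numerically should give identical content.
if [ "$(sort -n "$workdir/input.txt")" = "$(sort -n "$workdir/shuffled.txt")" ]; then
    same_lines="yes"
else
    same_lines="no"
fi
shuffled_count=$(wc -l < "$workdir/shuffled.txt")
echo "same lines: $same_lines, line count: $shuffled_count"
rm -rf "$workdir"
```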
Shuffle and output only 5 lines: This prints 5 randomly chosen lines from the file, which can be useful for sampling or testing.
shuf -n 5 filename.txt
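A small sketch of this kind of sampling, assuming a generated 100-line file in place of filename.txt: it draws five lines and counts how many of them are unique, since each input line is used at most once unless repetition is requested. The variable names are illustrative.

```shell
workdir=$(mktemp -d)
seq 1 100 > "$workdir/input.txt"         # hypothetical sample file

# Draw 5 lines at random; each input line appears at most once.
sample=$(shuf -n 5 "$workdir/input.txt")
printf '%s\n' "$sample"

# Count the sampled lines and the distinct values among them.
sample_count=$(printf '%s\n' "$sample" | wc -l)
unique_count=$(printf '%s\n' "$sample" | sort -u | wc -l)
echo "sampled: $sample_count, unique: $unique_count"
rm -rf "$workdir"
```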
Shuffle by generating numbers: You might want to shuffle a range of numbers for lottery simulations or for generating test inputs.
shuf -i 1-100 -n 5
This command picks 5 distinct numbers at random from the range 1 to 100 (inclusive) and prints them one per line.
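For the lottery and test-input scenarios mentioned above, the sketch below contrasts a draw without repetition with one that allows repeats via -r (--repeat, a GNU shuf option). The ranges 1-49 and 1-6 and the variable names are assumptions chosen for illustration.

```shell
# Six distinct numbers from 1 to 49, as in a lottery draw.
draw=$(shuf -i 1-49 -n 6)
printf 'draw:  %s\n' "$(printf '%s\n' "$draw" | paste -sd' ' -)"

# Three rolls of a six-sided die: -r allows the same number to repeat.
rolls=$(shuf -r -i 1-6 -n 3)
printf 'rolls: %s\n' "$(printf '%s\n' "$rolls" | paste -sd' ' -)"
```

Without -r, asking for more numbers than the range contains simply returns the whole range in random order; with -r, any count is possible because values may repeat.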
Advanced Usage
shuf isn't just for basic shuffling. It can be integrated into scripts and combined with other utilities for more complex tasks:
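Combine sort with shuf: You can pre-process the data with sort (for example, ordering it by a numeric third column) and then shuffle the result. Note that shuf produces a uniformly random permutation regardless of input order, so sorting first does not by itself weight the outcome; weighted selection needs an extra step, such as repeating each line in proportion to its weight before shuffling.
sort -nk3 filename.txt | shuf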
Provide a random sample to another process: If other processes or scripts require randomly selected data, you can pipe the output of shuf directly.
shuf filename.txt | some-other-command
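The list above mentions weighted randomness. Since shuf itself picks uniformly, one possible approach (a sketch, not a built-in shuf feature) is to repeat each entry as many times as its weight and then draw from the expanded list. The weights.txt layout of a name followed by an integer weight, and all names below, are hypothetical.

```shell
workdir=$(mktemp -d)
# Hypothetical data: a name followed by an integer weight.
printf 'gold 1\nsilver 3\nbronze 6\n' > "$workdir/weights.txt"

# Repeat each name as many times as its weight says.
awk '{ for (i = 0; i < $2; i++) print $1 }' "$workdir/weights.txt" \
    > "$workdir/expanded.txt"
expanded_count=$(wc -l < "$workdir/expanded.txt")

# A uniform pick from the expanded list is a weighted pick from the original.
winner=$(shuf -n 1 "$workdir/expanded.txt")
echo "winner: $winner"
rm -rf "$workdir"
```

With these weights the expanded list has ten lines, so bronze is drawn about six times in ten. The approach suits small integer weights; large weights would make the expanded list unwieldy.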
Conclusion
shuf is a versatile tool that is underutilized in many circles, overshadowed by more commonly used text processing utilities like awk, sed, and grep. Whether you're handling large datasets or need a random order for your script's input, shuf provides a straightforward and efficient solution. So next time you reach for a Python script or another heavier tool to randomise lines, consider shuf for its simplicity and speed. Happy shuffling!