Raspberry Pi_Eng_10.9.4 “uniq” Command – WeOmega

Published Book on Amazon

	All of IOT Starting with the Latest Raspberry Pi from Beginner to Advanced – Volume 1
	All of IOT Starting with the Latest Raspberry Pi from Beginner to Advanced – Volume 2

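Published Korean-Edition Books

	All About the Internet of Things (IOT) Starting with the Latest Raspberry Pi – From Beginner to Advanced (Volume 1)
	All About the Internet of Things (IOT) Starting with the Latest Raspberry Pi – From Beginner to Advanced (Volume 2)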

Original Book Contents

10.9.4 "uniq" Command

This command removes adjacent duplicate lines while copying data from the input to the output.

[Command Format]

uniq [option] [input] [output]
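
As a minimal sketch of the [input] and [output] operands, the commands below read from one file and write the result to another. The temporary directory, the file names "in.txt" and "out.txt", and the colour values are hypothetical and chosen only for illustration.

```shell
# Work in a temporary directory so that no real files are touched.
workdir=$(mktemp -d)
printf '%s\n' red red green blue blue > "$workdir/in.txt"

# The first operand is the input file and the second is the output file,
# so nothing is printed to the terminal by this command.
uniq "$workdir/in.txt" "$workdir/out.txt"

# Read the result back and clean up.
result=$(cat "$workdir/out.txt")
rm -r "$workdir"
printf '%s\n' "$result"
```

If the output operand is omitted, the result goes to standard output; if the input operand is also omitted, the data is read from standard input, which is what makes pipelines such as "sort file | uniq" possible.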

[Command Overview]

■ Removes adjacent duplicate lines from the input and writes the result to the output.

■ User privilege -- Normal user.

[Detailed Description]

■ Duplicates are checked only between adjacent lines, so if the data is not sorted, duplicate lines that are separated by other lines are not removed.

■ When duplicates are removed, the first occurrence is kept. For this reason, the data is usually sorted with the "sort" command before this command is run.
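
The adjacency rule described above can be illustrated with a short sketch. The sample values are made up for this illustration and are not taken from the book's test file.

```shell
# Hypothetical sample data: "apple" appears twice, but not on adjacent lines.
data=$(printf '%s\n' apple banana apple cherry cherry)

# Without sorting, only the adjacent "cherry" pair is collapsed.
unsorted_result=$(printf '%s\n' "$data" | uniq)

# After sorting, all duplicates are adjacent, so every one is collapsed.
sorted_result=$(printf '%s\n' "$data" | sort | uniq)

printf 'uniq only:\n%s\n\n' "$unsorted_result"
printf 'sort | uniq:\n%s\n' "$sorted_result"
```

In the first result "apple" still appears twice, while in the second each value appears exactly once.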

[Main Options]

-c, --count	prefix lines by the number of occurrences
-d, --repeated	only print duplicate lines
-f, --skip-fields=N	avoid comparing the first N fields
-i, --ignore-case	ignore differences in case when comparing
-s, --skip-chars=N	avoid comparing the first N characters
-u, --unique	only print unique lines
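
As a rough sketch of how some of these options behave, the commands below apply "-d", "-u" and "-f" to small inputs. The company names and year values are hypothetical sample data, and the short option forms are used throughout.

```shell
# Hypothetical input, already sorted so that duplicates are adjacent.
sorted=$(printf '%s\n' IBM LG LG Samsung Samsung Samsung Sony)

# -d: print one copy of each line that occurs more than once.
repeated=$(printf '%s\n' "$sorted" | uniq -d)

# -u: print only the lines that occur exactly once.
singles=$(printf '%s\n' "$sorted" | uniq -u)

# -f 1: skip the first field (the year) when comparing lines.
by_name=$(printf '%s\n' '2023 Sony' '2024 Sony' '2024 LG' | uniq -f 1)

printf 'repeated:\n%s\n\n' "$repeated"
printf 'singles:\n%s\n\n' "$singles"
printf 'by name:\n%s\n' "$by_name"
```

Here "-d" reports "LG" and "Samsung", "-u" reports "IBM" and "Sony", and "-f 1" keeps "2023 Sony" as the first line of its group because the year field is ignored in the comparison.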

[Usage Example]

The "testdata" directory of the "pi" account contains a file named "customer_list_dup.txt" with the following contents.

pi@raspberrypi ~/testdata $ cat customer_list_dup.txt
Microsoft
Google
IBM
Samsung
Facebook
Microsoft
Samsung
Sony
Hewlett-Packard

In the data above, "Microsoft" and "Samsung" each appear more than once. Now run the "uniq" command on this file.

pi@raspberrypi ~/testdata $ uniq customer_list_dup.txt
Microsoft
Google
IBM
Samsung
Facebook
Microsoft
Samsung
Sony
Hewlett-Packard

In the result above, the adjacent duplicate of "Samsung" has been removed, but both "Microsoft" entries are still displayed. This is because the "uniq" command compares only adjacent lines.

To solve this problem, first sort the data with the "sort" command and then run "uniq" on the result, as follows. This time the duplicate "Microsoft" entry is removed as well.

pi@raspberrypi ~/testdata $ sort customer_list_dup.txt | uniq
Facebook
Google
Hewlett-Packard
IBM
Microsoft
Samsung
Sony

The "uniq" command can provide additional information through its options. With the "-c" option, each line is prefixed with the number of times it occurs.

pi@raspberrypi ~/testdata $ sort customer_list_dup.txt | uniq -c
1 Facebook
1 Google
1 Hewlett-Packard
1 IBM
1 LG
2 Microsoft
3 Samsung
1 Sony
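
One possible follow-up, shown here only as a sketch, is to pass the counts through a second "sort" so that the most frequent entries come first. The names below are hypothetical stand-ins for the customer list, and the use of "head" and "awk" to pick out the top entry is an assumption of this example rather than something the book prescribes.

```shell
# Hypothetical data standing in for the customer list.
names=$(printf '%s\n' Samsung Microsoft Samsung Sony Microsoft Samsung)

# Count the occurrences of each name, then sort the counts
# numerically in descending order.
ranking=$(printf '%s\n' "$names" | sort | uniq -c | sort -rn)

# The first line holds the most frequent name; awk strips the
# leading padding that "uniq -c" may place before the count.
top=$(printf '%s\n' "$ranking" | head -n 1 | awk '{print $2, $1}')
printf '%s\n' "$top"
```

The first "sort" groups identical names so that "uniq -c" can count them, and the second "sort -rn" orders the resulting lines by that count.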