Remove duplicate lines in Notepad++


We need to remove duplicate lines from a text file. We will use Notepad++ for this.

19.10.2014 9 comments 119,584 views


Notepad++ has no dedicated function for this, but it can be done by combining a few of its existing features.

Suppose we have a file with the following content:

Method one

Open the Replace dialog for the file and enter the command

The replace settings must be set as shown in the figure:

Then click Replace All. The duplicate lines will be removed. Note, however, that the last occurrence of each line is kept, not the first.

Method two

If you need to remove duplicates so that the first occurrence of each line is kept rather than the last, a different approach is needed. The idea is simple: reverse the order of the lines, apply method one, and then restore the original order.

This requires the TextFX plugin. The linked page also explains how to install it.

To reverse the line order, do the following.

Select all the text with Ctrl + A.

Insert line numbers: TextFX → TextFX Tools → Insert Line Numbers.

If TextFX → TextFX Tools → +Sort ascending is checked, uncheck it.

Sort the lines: TextFX → TextFX Tools → Sort lines case sensitive (at column).

Remove the line numbers: TextFX → TextFX Tools → Delete Line Numbers or First Word.

Then apply method one to remove the duplicate lines, and afterwards reverse the line order once more to get back the original order.
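The reverse, deduplicate, reverse-back idea can also be sketched outside the editor. The Python below is only an illustration of the logic, not what Notepad++ or TextFX do internally; the function names are made up for this example.

```python
def dedupe_keep_last(lines):
    """Drop duplicate lines, keeping the LAST occurrence of each (like method one)."""
    seen = set()
    kept = []
    # Walk from the end so the last occurrence is the first one we meet.
    for line in reversed(lines):
        if line not in seen:
            seen.add(line)
            kept.append(line)
    kept.reverse()  # restore top-to-bottom order
    return kept


def dedupe_keep_first(lines):
    """Method two: reverse the lines, apply keep-last, reverse back."""
    return dedupe_keep_last(lines[::-1])[::-1]


sample = ["a", "b", "a", "c", "b"]
print(dedupe_keep_last(sample))   # ['a', 'c', 'b']
print(dedupe_keep_first(sample))  # ['a', 'b', 'c']
```

Reversing before and after turns "keep the last occurrence" into "keep the first occurrence", which is what the line-number sorting trick achieves in the editor.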

Method three

Still, for this kind of task I would rather use specialized tools (the methods in this article occasionally misbehave). Here are two working services that I use on occasion:

You need to install the TextFX plugin.

How to install it:

In the menu, choose "Plugins -> Plugin Manager -> Show Plugin Manager".

Then, in the window that opens, on the first tab, "Available", select the TextFX Characters plugin in the list and click "Install".

How to remove duplicate lines:

  1. Open the text document containing the list;
  2. Select all the text;
  3. Go to the menu TextFX -> TextFX Tools -> Sort lines case insensitive**

**Make sure the option "Sort outputs only UNIQUE lines" is checked.

That is how quickly and conveniently we removed the duplicate lines in Notepad++ and sorted the list.

Is it possible to remove duplicated rows in Notepad++, leaving only a single occurrence of a line?

Created 2010-10-18 10:42:56 UGEEN

9 answers

Notepad++ can do this, provided you wanted to sort by line, and remove the duplicate lines at the same time.

You will need the TextFX plugin. This used to be included in older versions of Notepad++, but if you have a newer version, you can add it from the menu by going to Plugins -> Plugin Manager -> Show Plugin Manager -> Available tab -> TextFX -> Install . In some cases it may also be called TextFX Characters , but this is the same thing

The check boxes and buttons required will now appear in the menu under: TextFX -> TextFX Tools .

Make sure "sort outputs only unique..." is checked. Next, select a block of text ( Ctrl + A to select the entire document). Finally, click "sort lines case sensitive" or "sort lines case insensitive".

Created 2010-10-18 10:46:42 Colin Pickard

Incredibly powerful plugin, despite its "age". Hope they will NEVER remove that one from the standard NPP plugin offer. The guy who thought about all the features in this plug-in was kind of a "visionary". – GeertVc 2014-09-01 09:32:31

Note that this method does not give any kind of warning if the file is read-only. My file was sorted anyway, so it seemed that the tool had worked, until I spotted a duplicate. Quite frustrating until I tried @stema’s search & replace method, which did warn me. – JV. 2014-12-04 12:44:02


More powerful than Excel. – Vasu 2015-04-22 21:58:02

TextPad does it with one key, F9. Hoping NP++ can also allow a hotkey for this operation. – prash 2016-04-12 09:11:37

@GeertVc: was that sarcasm, cynicism or something? There’s no TextFX plugin in my installation – Thomas Weller 2016-06-01 10:35:04

@Thomas: instead of accusing me of whatever crap, get up and look on the web and search for the plugin. To keep you lazy, here’s the link to it: https://sourceforge.net/projects/npp-plugins/files/TextFX/. Hope you’ll be able to install it. And yes, _that’s_ sarcasm. – GeertVc 2016-06-02 15:42:06

This is really AWESOME !! – JG’ s Spark 2016-06-10 09:37:42

NO I don’t want to sort anything – ACV 2016-12-20 10:01:04

Awesome! Thanks a lot! – pixelstuermer 2017-07-20 09:35:11

Thanks a lot! You helped me so much! – Tiago Ávila 2017-11-15 01:14:06

What about the Notepad++ x64 version? An x64 version of the TextFX plugin does not exist – Geograph 2018-01-14 15:22:55

I am not able to check the ‘Sort outputs only’ option. What to do? – th3pirat3 2018-03-09 18:18:23

TextFX is not in the 64-bit version. – Rhyous 2018-03-23 00:27:08

If you don’t care about row order (which I don’t think you do), then you can use a Linux/FreeBSD/MacOSX/Cygwin box and do:

Then open the file again in Notepad++.

Created 2010-10-18 10:46:11 Pablo Santa Cruz

Doesn’t work on Windows 7. "'cat' is not recognized as an internal or external command, operable program or batch file." – Iain Elder 2014-12-11 16:50:35

@Iain Elder: cat is a standard Unix utility, which is why this answer specifies that it works on Linux, FreeBSD, and MacOSX. The answer also suggests Cygwin: this is a Windows program that gives you a Unix-style shell, and with it, cat. Long story short (too late!): Win 7 needs Cygwin to do this. – Travis Clark 2015-01-14 16:14:39

In Windows you have PowerShell: ‘cat yourfile | sort -Unique’ – Elazar 2015-08-05 11:32:14

These are good examples of "the gratuitous use of cat". Forget about the cat utility and just use file redirection thusly: sort ** yourfile_nodups – scott8035 2016-05-16 18:57:03

@scott8035, I agree that cat is of no use for running that command, but I find it often helpful to start with cat when figuring out a long sequence of non-obvious commands, like cat file | sed . | sed . | sed . and so on. So I’d say that there might be reasons for using cat. Of course cat can be removed at the end, but some are too lazy for that. – FORTRAN 2017-09-14 06:56:13

You can install bash now on Windows 10, just search "Ubuntu" in Microsoft Store and follow the instructions in the Description. – Patronaut 2018-02-23 14:46:18

If the rows are immediately after each other, then you can use a regex replace.

Search Pattern: ^(.*\r?\n)(\1)+
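To see what an adjacent-duplicate pattern of this kind does, here is a rough sketch using Python's re module rather than Notepad++ itself. The replacement string \1 (keep one copy of the line) is an assumption, since the replace field is not shown above, and Python's regex engine is not identical to the one in Notepad++.

```python
import re

# One full line including its line ending, followed by one or more exact
# repeats of that same line. MULTILINE lets ^ match at every line start.
ADJACENT_DUPES = re.compile(r"^(.*\r?\n)(\1)+", re.MULTILINE)


def collapse_adjacent(text):
    """Collapse runs of identical consecutive lines into a single line."""
    # Assumed replacement: keep the first captured copy of the line.
    return ADJACENT_DUPES.sub(r"\1", text)


print(repr(collapse_adjacent("a\na\na\nb\na\nb\nb\n")))
# 'a\nb\na\nb\n'
```

Only consecutive repeats are collapsed, so the second "a" and "b" runs survive; the text has to be sorted first if all duplicates should go. A final line with no trailing line ending is not recognised as a repeat, because the pattern requires the line ending.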

Created 2010-10-18 10:53:02 Grant Peters

Maybe others have had luck with this, but for me ^(.*\n)\1 results in "Cant find the text" – b1nary.atr0phy 2012-04-28 18:18:18

@b1naryatr0phy make sure you have "Search Mode" set to "Regular expression". I also updated the pattern so that it can handle Windows-style line endings – Grant Peters 2012-05-01 13:25:27


Notepad++ has a light regex engine; it doesn’t permit advanced functions, not even the "\r? or \n", as it only works on a single line and you use $ for the \n characters – Stefan Rogin 2012-05-25 16:39:33

Yeps not working – Juzer Ali 2012-06-18 06:40:01

This eliminates one by one. You must repeat it many times. I wonder why \n+ -> \n does not work (though it reports many replacements) – Val 2012-09-06 09:48:19

@Val, if you make the back-reference part of the match a group with 1-or-more matches required, the pattern will match N contiguous duplicate lines at a time: ‘^(.*\r?\n)(\1)+’ – Kenigmatic 2016-04-29 23:05:41

Works perfectly! Thanks! – Aldracor 2016-10-25 10:47:36

The later versions of Notepad++ apparently do not include the TextFX plugin at all. In order to use the plugin for sorting/eliminating duplicates, the plugin must be either downloaded and installed (more involved) or added using the plugin manager.

A) Easy way (as described here).

Plugins -> Plugin Manager -> Show Plugin Manager -> Available tab -> TextFX Characters -> Install

B) More involved way, if another version is needed or the easy way does not work.

Download the plugin from SourceForge:

Open the zip file and extract NppTextFX.dll

Place NppTextFX.dll in the Notepad++ plugins directory, such as:
C:\Program Files\Notepad++\plugins

Start Notepad++, and TextFX will be one of the file menu items (as seen in Answer #1 above by Colin Pickard)

After installing the TextFX plugin, follow the instructions in Answer #1 to sort and remove duplicates.

Also, consider setting up a keyboard shortcut using Settings > Shortcut Mapper if you use this command frequently or want to replicate a keyboard shortcut, such as F9 in TextPad for sorting.

Created 2012-11-13 16:33:13 eeasterly

Since Notepad++ Version 6 you can use this regex in the search and replace dialogue:

^(.*?)$\s+?^(?=.*^\1$)

and replace with nothing. Of all duplicate rows, this keeps only the last occurrence in the file.

No sorting is needed for that and the duplicate rows can be anywhere in the file!

You need to check the options "Regular expression" and ". matches newline":

^ matches the start of the line.

(.*?) matches any characters 0 or more times, but as few as possible (it matches exactly one row; this is needed because of the ". matches newline" option). The matched row is stored, because of the brackets around it, and is accessible using \1

$ matches the end of the line.

\s+?^ this part matches all whitespace characters (newlines!) up to the start of the next row ==> This removes the newlines after the matched row, so that no empty row is left after the replacement.

(?=.*^\1$) this is a positive lookahead assertion. This is the important part of this regex: a row is only matched (and removed) when exactly the same row follows somewhere else in the file.
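The lookahead approach can be tried out in Python as a rough check of the logic. This is a sketch only: Python's re module stands in for the Notepad++ engine, with re.MULTILINE and re.DOTALL assumed to play the roles of line-anchored ^ and $ and the ". matches newline" option, and the two engines may differ on edge cases.

```python
import re

# A row, the whitespace up to the next row, and a lookahead requiring that
# the same row appears again somewhere later in the text.
LATER_DUPLICATE = re.compile(
    r"^(.*?)$\s+?^(?=.*^\1$)",
    re.MULTILINE | re.DOTALL,
)


def remove_earlier_duplicates(text):
    """Delete every row that occurs again later, so the last occurrence stays."""
    return LATER_DUPLICATE.sub("", text)


print(repr(remove_earlier_duplicates("a\nb\na\nc\nb\n")))
# 'a\nc\nb\n'
```

No sorting is involved: the first "a" and the first "b" are removed because identical rows follow later, while "c" and the final "a" and "b" stay. Because (.*?) may grow across line breaks when "." matches newlines, unusual inputs can behave unexpectedly, so it is worth testing on a copy of the file first.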

Created 2013-04-30 06:27:55 stema

This one is better indeed than the other regex. No need for multiple passes to eliminate all duplicates. – Benny 2013-06-20 03:55:12

Oh, this one is brilliant, it even deletes empty rows, I’m macroing it this very moment 🙂 – Aprillion 2013-06-28 16:14:09

Great to learn. Precise explanation too! Thanks to both the raiser and reply-er! – SarjanWebDev 2013-10-29 01:50:54

It just removes ALL lines in a file in some cases. – SerG 2014-02-20 13:56:06


Is there any way to remove the LAST occurrence? This matches all but the last one. – Cullub 2014-09-23 11:18:20

In my case where this solution removed all lines, unchecking the ‘. matches newline’ did the trick. – Kuitsi 2015-12-01 07:34:42

**Perfect!** I was using Notepad++ on a locked-down system with no internet access. No way to download plugins, so this was better for me. – ADTC 2016-01-06 05:47:16

This does not work for me. – Loenix 2016-11-12 16:46:15

@SerG In some cases it didn’t work for me either, but when I removed "matches newline" it did 🙂 – Davidenko 2016-12-19 08:50:38

Great regex. Thanks. – Awanish Kumar 2017-01-23 07:45:33

Thanks, it's awesome – Vinit Kadkol 2017-05-25 05:54:37

Search for the Regular Expression: (\w+)([\w\W]*)\1

Replace it with: $1$2

Hit the Replace button until there are no more matches for the regular expression in your file.

Created 2014-05-22 13:05:47 Hesham Eraqi

Created a test file to try this, but the regular expression did not work reliably to get the job done. – RockPaperLizard 2016-03-20 05:29:23

None worked for me.

Created a test file to try this, but the regular expression did not work reliably to get the job done. – RockPaperLizard 2016-03-20 05:28:30

For all my data, it worked fine. I forgot what my solution was. Add more details on where it failed so that other people might improve this regex. – Manohar Reddy Poreddy 2016-03-20 05:46:58

I created a file so each line had an integer between 0-999 on it, in random order, sometimes with duplicates. It didn’t remove most of the duplicates, and didn’t remove any duplicates that were not sequential. – RockPaperLizard 2016-03-20 05:58:27

Please do provide 2 examples for working and for not-working ones. It will help someone. – Manohar Reddy Poreddy 2016-03-20 09:01:35

The only one that worked for me (npp 7.3). Thanks 🙂 – Sickboy 2017-11-22 15:49:21

@Sickboy great! – Manohar Reddy Poreddy 2017-11-23 03:57:12

Notepad++

Ensure that in Search Mode

you have selected Regular expression radio button

Find what:

Replace with:

before:

and we think there

Is it possible to

Is it possible to

after:

Is it possible to

Created 2017-05-20 22:21:30 blueberry0xff

Plugin Manager is currently unavailable (it does not come with the distribution) for Notepad++. You must install it manually ( https://github.com/bruderstein/nppPluginManager/releases ), and even if you do, a lot of the plugins are not available anymore (there is no TextFX plugin).

Maybe there is another plugin which contains the required functionality. Other than that, the only way to do it in Notepad++ is to use some special regex for matching and then replacing (CTRL+F -> Replace tab).

Although there are many functions available via the Edit menu (trimming, removing empty lines, sorting, converting EOL), there is no "unique" operation available.

If you have Windows 10, then you can enable Bash (just type Ubuntu in Microsoft Store and follow the instructions in the Description to install it) and use cat your_file.txt | sort | uniq > your_file_edited.txt . Of course you must be in the same working directory as "your_file.txt" or refer to it via its path.
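If neither Bash nor a plugin is at hand, the same sort-then-unique result can be approximated in a few lines of Python. This is a sketch under assumptions: the function name is illustrative, and Python orders strings by code point, which can differ from the locale-aware ordering that sort may use.

```python
def sort_unique(lines):
    """Return the distinct lines in sorted order; the original order is lost."""
    # set() drops the duplicates, sorted() puts the survivors in order.
    return sorted(set(lines))


print(sort_unique(["pear", "apple", "pear", "fig", "apple"]))
# ['apple', 'fig', 'pear']
```

To apply it to a file, read the file with splitlines(), pass the list to sort_unique, and write the result back joined with newlines.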

Created 2018-02-23 14:42:34 Patronaut
