I am trying to compose a grep statement and it is killing me. I am also tired of getting the "Argument list too long" error. I have a file, let's call it subset.txt. It contains hundreds of lines, each with a specific string such as MO43312948. In my objects directory I have thousands of files, and I need to copy all the files that contain any of the strings listed in subset.txt into another directory.

I started with this, just to return the matching files from the objects directory:

grep -F "$(subset.txt)" /objects/*

I keep getting `bash: /bin/grep: Argument list too long`

I have searched online and found similar discussions, but not similar enough to help me figure this out. Any help would be greatly appreciated on this. I have been trying to figure this out for a week and I need to get some forward momentum on this.

Why have you put "$(subset.txt)" in the command like that? That is command substitution, which will make your shell execute subset.txt (as if it were a command or script). – JigglyNaga 15 hours ago
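
To illustrate the comment above: `$(subset.txt)` asks the shell to run subset.txt as a command, whereas `$(cat subset.txt)` substitutes the file's contents. The following is only a sketch; the temporary directory and the second sample ID are made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo data; the directory and the IDs are illustrative only.
tmp=$(mktemp -d)
printf 'MO43312948\nMO00000001\n' > "$tmp/subset.txt"

# $(...) runs its contents as a command. subset.txt is not an executable
# command, so the shell reports an error and the substitution is empty.
as_command=$(cd "$tmp" && subset.txt 2>/dev/null) || true

# Substituting the output of cat yields the file's contents instead.
as_contents=$(cat "$tmp/subset.txt")

printf 'as command:  [%s]\n' "$as_command"
printf 'as contents: [%s]\n' "$as_contents"
rm -rf "$tmp"
```

Even with the substitution corrected, the glob /objects/* still expands to one argument per file, which is the likely source of the "Argument list too long" error when the directory holds thousands of files.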

You can pass a directory as a target to grep with -r (or -R, which also follows all symbolic links) and a file of input patterns with -f:

  -f FILE, --file=FILE
          Obtain patterns from FILE, one per line.  If this option is used
          multiple  times  or  is  combined with the -e (--regexp) option,
          search for all patterns given.  The  empty  file  contains  zero
          patterns, and therefore matches nothing.

   -R, --dereference-recursive
          Read all files under each directory,  recursively.   Follow  all
          symbolic links, unlike -r.

So, you're looking for:

grep -Ff subset.txt -r objects/

You can get the list of matching files with:

grep -Flf subset.txt -r objects/
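
As a rough check of what this prints, here is a sketch run against a throwaway tree. The file names and the second ID are invented for the example; only subset.txt, objects/ and MO43312948 come from the question.

```shell
#!/bin/sh
# Hypothetical sample tree; file names and the second ID are illustrative.
tmp=$(mktemp -d)
mkdir -p "$tmp/objects/sub"
printf 'MO43312948\nMO77777777\n' > "$tmp/subset.txt"
printf 'header\nid=MO43312948\n' > "$tmp/objects/a.dat"
printf 'nothing relevant here\n' > "$tmp/objects/b.dat"
printf 'MO77777777 trailing\n'   > "$tmp/objects/sub/c.dat"

# -F: fixed strings, -l: print only the names of matching files,
# -f: read the patterns from a file, -r: recurse into the directory.
matches=$(cd "$tmp" && grep -Flf subset.txt -r objects/ | sort)
printf '%s\n' "$matches"
rm -rf "$tmp"
```

Only the files containing at least one of the listed strings are named; b.dat is not, because it contains neither ID.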

So, if your final list isn't too long, you can just do (use cp instead of mv if you want to copy the files rather than move them):

mv $(grep -Flf subset.txt -r objects/) bar/

If that returns an argument list too long error, use:

grep -Flf subset.txt -r objects/ | xargs -I{} mv {} bar/

And if your file names can contain spaces or other strange characters, use (assuming GNU grep):

grep -FZlf subset.txt -r objects/ | xargs -0I{} mv {} bar/

Finally, if you want to exclude binary files, use:

grep -IFZlf subset.txt -r objects/ | xargs -0I{} mv {} bar/
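
Since the question asks to copy rather than move, the same pipeline can be sketched with cp. This assumes GNU grep (for -Z) and an xargs that supports -0; the sample file names are invented, including one with a space in it.

```shell
#!/bin/sh
# Hypothetical sample data; the file names are illustrative only.
# Assumes GNU grep (-Z) and an xargs that supports -0.
tmp=$(mktemp -d)
mkdir -p "$tmp/objects" "$tmp/bar"
printf 'MO43312948\n'    > "$tmp/subset.txt"
printf 'id MO43312948\n' > "$tmp/objects/with space.dat"
printf 'id MO43312948\n' > "$tmp/objects/plain.dat"
printf 'no match\n'      > "$tmp/objects/other.dat"

cd "$tmp" || exit 1
# -Z ends each file name with a NUL byte and xargs -0 splits on NUL, so the
# embedded space survives. cp is used instead of mv to keep the originals.
grep -FZlf subset.txt -r objects/ | xargs -0 -I{} cp -- {} bar/

copied=$(ls bar | sort)
printf '%s\n' "$copied"
cd / && rm -rf "$tmp"
```

With -I{}, xargs runs one cp per file, which is simple but not the fastest option for very large result sets.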

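Use

grep -F -f subset.txt

to tell grep to read its patterns from the file subset.txt.

You can use find to walk the directory tree. Add -l so that grep prints the names of the matching files; without it, grep invoked on a single file prints only the matching lines, with no file name:

find . -type f -exec grep -lF -f subset.txt {} \;

or, to pass many files to each grep invocation instead of running one grep per file:

find . -type f -exec grep -lF -f subset.txt {} +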
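
The two forms differ only in how find invokes grep: `\;` starts one grep per file, while `+` batches file names into as few invocations as fit within the system's argument limit. A small sketch on invented files follows; -l is used so that only file names are printed.

```shell
#!/bin/sh
# Hypothetical sample files; the names f1.dat..f3.dat are illustrative only.
tmp=$(mktemp -d)
mkdir -p "$tmp/objects"
printf 'MO43312948\n' > "$tmp/subset.txt"
for i in 1 2 3; do
    printf 'id MO43312948\n' > "$tmp/objects/f$i.dat"
done

cd "$tmp" || exit 1
# One grep process per file.
per_file=$(find objects -type f -exec grep -lF -f subset.txt {} \; | sort)
# File names batched into as few grep invocations as possible.
batched=$(find objects -type f -exec grep -lF -f subset.txt {} + | sort)

printf 'per file:\n%s\nbatched:\n%s\n' "$per_file" "$batched"
cd / && rm -rf "$tmp"
```

Both variants list the same files; the batched form simply starts far fewer processes on a large tree.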
    
Any advantage of using find instead of -r other than that you do additional filtering? – phk 15 hours ago
    
None I can think of. In fact, I didn't know the -r option. – Archemar 12 hours ago

@phk grep -r searches in symlinks to regular files, which may or may not be desirable (if they point inside the same tree, you're searching the same file twice; if they point outside, you're searching a file which may or may not be desired). – Gilles 7 hours ago
    
Modern versions of grep have options to control their interaction with symbolic links (man grep to determine the specifics for the current system). A recursive grep will be a lot faster than running grep individually on every file via find. – Perry 2 hours ago

If you want to speed up grep even more, you can set the locale to C before running it, for example by prefixing the command with LC_ALL=C (note the uppercase C) or by exporting LC_ALL=C in your shell. grep then inherits the setting and skips multibyte (Unicode) processing, which in some cases speeds it up dramatically. A great blog post documenting this can be found at http://www.inmotionhosting.com/support/website/ssh/speed-up-grep-searches-with-lc-all. The same trick can also speed up bash shell scripts, not just grep.
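
A minimal sketch of the two ways to apply the setting, using invented sample data. It only shows that the results are the same either way; any speed-up depends on your data and grep version, so wrap the commands in `time` to measure it yourself.

```shell
#!/bin/sh
# Hypothetical sample data; data.txt and the second ID are illustrative only.
tmp=$(mktemp -d)
printf 'MO43312948\n' > "$tmp/subset.txt"
printf 'id MO43312948\nid MO00000000\n' > "$tmp/data.txt"

# Prefixing the assignment sets LC_ALL for this one grep process only.
one_off=$(LC_ALL=C grep -cFf "$tmp/subset.txt" "$tmp/data.txt")

# Exporting it affects every command started afterwards; the subshell
# created by $(...) keeps the change from leaking into the rest of the script.
exported=$(export LC_ALL=C; grep -cFf "$tmp/subset.txt" "$tmp/data.txt")

printf 'matching lines: %s (prefix), %s (export)\n' "$one_off" "$exported"
rm -rf "$tmp"
```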
