#!/bin/bash

## This script searches for a specific $sequence in a specific $input_file and appends both the ID and the sequence to an $output_file (if the sequence is found in the input).
## The script can be run multiple times with the same $output_file and different $input_files (for different selection pools); results from each new input are appended without overwriting the results from previous $input_files.
## This script is useful for pools that are very heterogeneous (i.e., have many unique sequences), which make FASTAptamer Clust take too long to run and produce FASTAptamer Enrich files that are too large to open in typical spreadsheet programs for analysis.
## Grep is a command-line utility standard in many UNIX installations, so no program installations are needed to run this script.

## Grep Searcher Variables
sequence=NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN  ## Sequence to search for.
input_file=/full/path/to/input.fasta  ## FASTA file of sequences to search through.
output_file=/full/path/to/file.txt  ## Output TXT file of any matches for the sequence.

## Look up the given $sequence in the $input_file and print the ID and sequence.
## -B1 also prints the line before each match, which is the ID line when each sequence occupies a single line.
## If a match is found, a marker line naming the $input_file is appended after the results.
grep -B1 -- "$sequence" "$input_file" >> "$output_file" && echo "^^^^^" "$input_file" "^^^^^" >> "$output_file"
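
## Optional sketch, not part of the original script: a hypothetical helper, search_pools,
## that repeats the same grep lookup over several input files in one call instead of
## editing $input_file and re-running the script for each selection pool.
## The function name and its argument order are assumptions made for illustration.
## Defining the function does nothing by itself; it only runs when called, e.g.:
##   search_pools "$sequence" "$output_file" /full/path/to/pool1.fasta /full/path/to/pool2.fasta
search_pools() {
    local seq=$1 out=$2   # Sequence to search for, and output file to append to.
    shift 2               # Remaining arguments are the input files to search.
    local f
    for f in "$@"; do
        # Append any matches (ID line plus sequence line), then a marker naming the file searched.
        grep -B1 -- "$seq" "$f" >> "$out" && echo "^^^^^" "$f" "^^^^^" >> "$out"
    done
}
## Note: as in the single-file command, no marker line is written for a file with no matches.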