You can also download a portable version of Cas-OFFinder!
To download and run Cas-OFFinder, please follow the instructions below.
STEP 1
Cas-OFFinder requires an OpenCL-enabled device (CPU, GPU, etc.) with the corresponding runtime driver installed in order to run properly.
Please download and install the latest driver from one of the links below:
- Intel (Download 'OpenCL runtime' in the middle of the page): http://software.intel.com/en-us/vcsource/tools/opencl-sdk
- NVIDIA: http://www.nvidia.com/Download/index.aspx
- AMD: http://support.amd.com/en-us/download
Before installing Cas-OFFinder, please check whether your device supports OpenCL.
The Khronos Group provides an extensive list of OpenCL-supported devices here.
Supported OS
- Linux (with proprietary drivers installed)
- Mac OS X (Snow Leopard or higher)
- Windows 7 or higher (XP or below is not supported)
STEP 2
The whole genome of the target organism is needed (in FASTA format). You can find one at one of the links below:
Extract all FASTA files into a single directory, and note the full path of that directory.
STEP 3
Download the Cas-OFFinder binary here.
(Its source code is available on GitHub, or downloadable as a zip file. If you just want to use Cas-OFFinder, you don't need to download the source code files.)
NOTE: Cas-OFFinder has been reported to hang occasionally with certain versions of the AMD driver, for reasons that are not yet known. If you experience this issue, please try an older version of Cas-OFFinder, or run Cas-OFFinder on a system with GPUs/accelerators from another vendor. We are currently examining this issue and will release a revised version of Cas-OFFinder as soon as it is fixed.
STEP 4 - Usage
The basic usage of Cas-OFFinder is as follows:
cas-offinder {input_file_path} {G or C} {output_file_path}
G stands for using all available GPU devices, and C for using all CPUs.
Input file
- First line - path of the directory containing the chromosome FASTA files
- Second line - desired pattern, including the PAM sequence
- Third and subsequent lines - query sequences, each followed by its maximum number of mismatches, separated by spaces. (The desired pattern and the query sequences must all be the same length!)
Note that Cas-OFFinder allows mixed bases to account for the degeneracy in PAM sequences, and the number of mismatched bases is not limited!
The following codes are supported for the bases:
Code | Base |
---|---|
A | Adenine |
C | Cytosine |
G | Guanine |
T | Thymine |
Code | Base |
---|---|
R | A or G |
Y | C or T |
S | G or C |
W | A or T |
K | G or T |
M | A or C |
Code | Base |
---|---|
B | C or G or T |
D | A or G or T |
H | A or C or T |
V | A or C or G |
N | any base |
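To make the tables above concrete, here is a minimal Python sketch of how a mixed-base pattern can be compared against a plain A/C/G/T sequence. It is only an illustration of the codes: the names `IUPAC`, `code_matches`, and `pattern_matches` are hypothetical, and this is not how Cas-OFFinder itself (which runs on OpenCL) implements the search.

```python
# Bases represented by each code, copied from the tables above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "GC", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def code_matches(code, base):
    """Return True if a single plain base is allowed by a (possibly mixed) code."""
    return base.upper() in IUPAC[code.upper()]


def pattern_matches(pattern, site):
    """Return True if every position of `site` is allowed by `pattern`."""
    if len(pattern) != len(site):
        return False
    return all(code_matches(c, b) for c, b in zip(pattern, site))


if __name__ == "__main__":
    # An "NRG" PAM accepts both AGG and CAG, but not ACG.
    for pam in ("AGG", "CAG", "ACG"):
        print(pam, pattern_matches("NRG", pam))
```

With this mapping, an `NRG` PAM accepts any base followed by A or G and then G, which is why both `AGG` and `CAG` pass while `ACG` does not.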
An example of an input file:
/var/chromosomes/human_hg19
NNNNNNNNNNNNNNNNNNNNNRG
GGCCGACCTGTCGCTGACGCNNN 5
CGCCAGCGTCAGCGACAGGTNNN 5
ACGGCGCCAGCGTCAGCGACNNN 5
GTCGCTGACGCTGGCGCCGTNNN 5
...
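Because the pattern and every query sequence must have the same length, it can be worth checking an input file before starting a long run. The following Python sketch does that under the layout described above. `parse_input` is a hypothetical helper, not part of Cas-OFFinder, and it assumes each query line holds exactly one sequence and one integer.

```python
def parse_input(text):
    """Split input-file text into (genome_dir, pattern, queries).

    `queries` is a list of (sequence, max_mismatches) tuples.
    Raises ValueError if the layout or the lengths are inconsistent.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 3:
        raise ValueError("expected a path, a pattern, and at least one query")

    genome_dir, pattern = lines[0], lines[1]
    queries = []
    for number, line in enumerate(lines[2:], start=3):
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {number}: expected '<sequence> <mismatches>'")
        sequence, max_mismatches = parts[0], int(parts[1])
        if len(sequence) != len(pattern):
            raise ValueError(
                f"line {number}: query length {len(sequence)} "
                f"differs from pattern length {len(pattern)}"
            )
        queries.append((sequence, max_mismatches))
    return genome_dir, pattern, queries
```

A typical use would be `parse_input(open("input.txt").read())`, where `input.txt` is whatever file you plan to pass to Cas-OFFinder.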
Now you can run Cas-OFFinder as follows:
cas-offinder input.txt G out.txt
If you get the error message "command not found", try adding ./ before cas-offinder.
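If you want to launch Cas-OFFinder from a script, the command line above can be assembled in Python. This is a minimal sketch, not an official wrapper: `build_command` and `run_cas_offinder` are hypothetical names, and the fallback to `./cas-offinder` simply mirrors the "command not found" advice above by assuming the binary sits in the current directory.

```python
import shutil
import subprocess


def build_command(input_path, device, output_path, executable="cas-offinder"):
    """Return the argument list for: cas-offinder {input} {G or C} {output}."""
    if device not in ("G", "C"):
        raise ValueError("device must be 'G' (GPUs) or 'C' (CPUs)")
    # Use the binary found on PATH; otherwise assume it is in the current directory.
    resolved = shutil.which(executable) or "./" + executable
    return [resolved, input_path, device, output_path]


def run_cas_offinder(input_path, device, output_path):
    """Run Cas-OFFinder and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(input_path, device, output_path), check=True)
```

For example, `run_cas_offinder("input.txt", "G", "out.txt")` corresponds to the command shown above.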
Output file
- First column - given query sequence
- Second column - FASTA sequence title (if you downloaded it from UCSC or Ensembl, it is usually a chromosome name)
- Third column - position of the potential off-target site (same convention as Bowtie)
- Fourth column - actual sequence located at the position (mismatched bases are shown in lowercase letters)
- Fifth column - indicates whether the found sequence is on the forward strand (+) or the reverse strand (-)
- Last column - the number of mismatched bases ('N' bases in the PAM sequence are not counted as mismatches)
An example of an output file:
GGCCGACCTGTCGCTGACGCNNN chr8 49679 GGgCatCCTGTCGCaGACaCAGG + 5
GGCCGACCTGTCGCTGACGCNNN chr8 517739 GcCCtgCaTGTgGCTGACGCAGG + 5
GGCCGACCTGTCGCTGACGCNNN chr8 599935 tGCCGtCtTcTCcCTGACGCCAG - 5
GGCCGACCTGTCGCTGACGCNNN chr8 5308348 GGCaGgCCTGgCttTGACGCAGG - 5
GGCCGACCTGTCGCTGACGCNNN chr8 9525579 GGCCcAgCTGTtGCTGAtGaAAG + 5
GGCCGACCTGTCGCTGACGCNNN chr8 12657177 GGCCcACCTGTgGCTGcCcaTAG - 5
GGCCGACCTGTCGCTGACGCNNN chr8 12808911 GGCCGACCaGgtGCTccCGCCGG + 5
GGCCGACCTGTCGCTGACGCNNN chr8 21351922 GGCCcACCTGaCtCTGAgGaCAG - 5
GGCCGACCTGTCGCTGACGCNNN chr8 21965064 GGCCGtCCTGcgGCTGctGCAGG - 5
GGCCGACCTGTCGCTGACGCNNN chr8 22409058 GcCCGACCccTCcCcGACGCCAG + 5
...
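The output columns described above can be read into named fields for downstream filtering. The sketch below is one possible way to do it in Python. `OffTarget`, `parse_output_line`, and `lowercase_count` are hypothetical names, and the code assumes the columns are separated by whitespace and that the FASTA sequence title contains no spaces.

```python
from collections import namedtuple

# One record per output line, in the column order described above.
OffTarget = namedtuple(
    "OffTarget", ["query", "title", "position", "site", "strand", "mismatches"]
)


def parse_output_line(line):
    """Parse one output line into an OffTarget record."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    query, title, position, site, strand, mismatches = fields
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand: {strand!r}")
    return OffTarget(query, title, int(position), site, strand, int(mismatches))


def lowercase_count(site):
    """Count lowercase letters, which mark mismatched bases in the fourth column."""
    return sum(1 for ch in site if ch.islower())


if __name__ == "__main__":
    hit = parse_output_line(
        "GGCCGACCTGTCGCTGACGCNNN chr8 49679 GGgCatCCTGTCGCaGACaCAGG + 5"
    )
    print(hit.title, hit.position, hit.strand, hit.mismatches)
    print(lowercase_count(hit.site))
```

For the first line of the example output, the number of lowercase letters in the fourth column equals the value in the last column.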