If installing with pip, non-Python dependencies need to be installed separately.

Knock-knock tries to identify the different ways that genome editing experiments can produce unintended shufflings and rearrangements of the intended editing outcome. In order to do this, the strategy used to align sequencing reads makes as few assumptions as possible about how the reads should look. Instead, it takes each amplicon sequencing read and attempts to produce a comprehensive set of local alignments between portions of the read and all relevant sequences present in the edited cell. Each read is then categorized by identifying a parsimonious subset of local alignments that cover the whole read and analyzing the architecture of this set of covering alignments.

Knock-knock supports PacBio CCS data for longer (~thousands of nts) amplicons and paired-end Illumina data for shorter (~hundreds of nts) amplicons.

Knock-knock provides a few ways to interactively explore the different types of alignment architectures produced by each experiment. A guided tour and a small live demo are available.

[Figure: outcome-stratified amplicon length distributions]

This tutorial will walk you through the process of using knock-knock to analyze amplicon sequencing data of an editing experiment, broken down into six steps:

1. Create a project directory.
2. Obtain reference genomes and build alignment indices from them.
3. Provide information about the genomic loci targeted for editing and HDR donor sequences.
4. Fill out sample sheets associating each sample with sequencing data files and an editing strategy.
5. Process data to generate alignments and classify outcomes.
6. Generate summary tables and visualizations.

The first step in using knock-knock is to create a project directory that will hold all input data, reference sequences, and analysis output for a given project. Every time knock-knock is run, this directory is given as a command line argument to tell knock-knock which project to analyze. Throughout this documentation, PROJECT_DIR will be used as a stand-in for the path to an actual project directory.

Knock-knock is packaged with some small example data sets for testing purposes, which can be installed to a user-specified project directory.

Obtaining reference sequences and building indices

Once you have created a project directory, knock-knock needs to be provided with the reference genome of a targeted organism in fasta format in order to build indices from this reference. These files are stored in a directory called indices inside a project directory.

Knock-knock provides a built-in way to download references and build indices for human (hg38), mouse (mm10), or E. coli (e_coli):

    knock-knock build-indices PROJECT_DIR ORGANISM

where ORGANISM is one of hg38, mm10, or e_coli, and NUM_THREADS is an optional argument that can be provided to use multiple threads for index building. (This can take up to several hours for mammalian-scale genomes.) Running this command for hg38 will populate PROJECT_DIR/indices/hg38 with the downloaded reference and the indices built from it.

Alternatively, if reference genomes and indices already exist (e.g. in another project's directory), a YAML file PROJECT_DIR/index_locations.yaml that lists paths can be provided.

The next step is to provide information about which genomic location was targeted for editing and the sequence of the donor that was provided (if any). Knock-knock refers to the combination of a genomic location and associated homology donor as a target. Information about targets is stored in a directory called targets inside a project directory.

Genomic location is specified by providing the name of the genome targeted (which must exist in PROJECT_DIR/indices), the protospacer of the sgRNA that was used for cutting, and the amplicon primers flanking the genomic location of this protospacer that were used to amplify the genomic DNA.

Targets are defined in a set of csv files inside the targets directory. A group of "parts-list" files (sgRNAs.csv, amplicon_primers.csv, donors.csv, and extra_sequences.csv) is used to register named sequences of each csv's corresponding type, and a master csv file PROJECT_DIR/targets/targets.csv defines each target using references to these named sequences.

targets.csv

A target is defined by filling out a row in PROJECT_DIR/targets/targets.csv, which has a fixed set of required columns.
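The directory layout described in the text (an indices directory, a targets directory, the master targets.csv, and the four parts-list files) can be checked before running an analysis. The following is a minimal sketch in Python. The helper `missing_project_paths` is hypothetical and is not part of knock-knock. The text does not say whether every parts-list file is mandatory, so a reported path is only a hint that something may still need to be filled in.

```python
from pathlib import Path

# File names are taken from the text above; treating all of them as
# expected is an assumption made for this sketch.
PARTS_LIST_FILES = (
    "sgRNAs.csv",
    "amplicon_primers.csv",
    "donors.csv",
    "extra_sequences.csv",
)


def missing_project_paths(project_dir):
    """Return the expected paths that do not exist under project_dir.

    Hypothetical helper, not a knock-knock function.
    """
    project_dir = Path(project_dir)
    targets_dir = project_dir / "targets"

    expected = [
        project_dir / "indices",      # reference genomes and indices
        targets_dir,                  # directory holding target definitions
        targets_dir / "targets.csv",  # master file, one row per target
    ]
    expected += [targets_dir / name for name in PARTS_LIST_FILES]

    return [path for path in expected if not path.exists()]
```

Calling `missing_project_paths(PROJECT_DIR)` on a fully populated project directory returns an empty list.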
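The categorization idea described in the text (identifying a parsimonious subset of local alignments that covers the whole read) can be illustrated with a classic greedy interval cover. This is a sketch under stated assumptions, not knock-knock's actual algorithm: "parsimonious" is read here as "fewest alignments", each alignment is reduced to a half-open interval on the read, and the `LocalAlignment` type and `minimal_cover` function are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple, Optional


class LocalAlignment(NamedTuple):
    """Hypothetical, simplified view of one local alignment."""
    reference: str  # sequence aligned to, e.g. "target" or "donor"
    start: int      # first read position covered (0-based, inclusive)
    end: int        # one past the last read position covered


def minimal_cover(
    alignments: List[LocalAlignment], read_length: int
) -> Optional[List[LocalAlignment]]:
    """Greedy minimum cover of read positions [0, read_length).

    Returns the chosen alignments in read order, or None if some read
    position is not covered by any alignment.
    """
    remaining = sorted(alignments, key=lambda a: a.start)
    cover = []
    covered_to = 0
    i = 0
    while covered_to < read_length:
        best = None
        # Among alignments starting at or before the first uncovered
        # position, keep the one that extends furthest along the read.
        while i < len(remaining) and remaining[i].start <= covered_to:
            if best is None or remaining[i].end > best.end:
                best = remaining[i]
            i += 1
        if best is None or best.end <= covered_to:
            return None  # gap in coverage
        cover.append(best)
        covered_to = best.end
    return cover
```

For a 100-nt read with alignments to the target at 0-40, 30-50, and 75-100 and to the donor at 35-80, the sketch selects target, donor, target. The order of reference names along the read is the kind of "architecture" the text refers to; a real implementation would also have to weigh alignment quality and strand, which this sketch ignores.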