This manuscript was published in Nature Genetics in July 2017. A PDF of the manuscript can be downloaded from Nature Genetics.
Raw sequencing data for this paper were deposited with NCBI GEO under accession number GSE85741.
The analysis code is available on GitHub: Shao_NG_2017
We have also made available a virtual server image with all the software packages, analysis code and raw data used in the paper. We documented our setup instructions here. This virtual server is hosted by Amazon Web Services (AWS) as an Amazon Machine Image (AMI). Please see the AWS web page for information about this service. Specific instructions regarding our AMI are below.
AMI ID: ami-efa1d5f9 (requires an Amazon Web Services account)
This AMI was built from the official Ubuntu 16.04 image (ami-cf68e0d8).
After logging in to your AWS account, use the AMI link above to launch an instance based on this AMI. If you plan to re-run the analysis, we recommend an instance with at least 30 GB of memory. Please note that Amazon will charge you for both the instance running time and the instance storage, which is 250 GB because of the large amount of raw and processed data included.
As the AMI runs Ubuntu 16.04, you should log in via SSH as the "ubuntu" user with the SSH private key you supplied during instance creation.
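As a minimal sketch of the login step, the snippet below assembles the SSH command and prints it for review instead of running it. The key path and the instance address are hypothetical placeholders, not values from our setup: substitute the private key you chose at instance creation and the public DNS name or IP address shown for your instance in the AWS console.

```shell
# Hypothetical placeholders: replace with your own key file and instance address.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/my-aws-key.pem}"
INSTANCE_HOST="${INSTANCE_HOST:-your-instance-public-dns}"

# SSH rejects private keys that other users can read, so restrict permissions
# if the key file is present on this machine.
if [ -f "$KEY_FILE" ]; then
    chmod 400 "$KEY_FILE"
fi

# Build the login command for the "ubuntu" user and print it.
SSH_CMD="ssh -i \"$KEY_FILE\" ubuntu@$INSTANCE_HOST"
echo "$SSH_CMD"
```

Running the printed command from your local machine should open a shell on the instance, provided the instance's security group allows inbound SSH (port 22) from your address.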
All analysis code is in /data/analysis_code/. This directory is a clone of our GitHub Shao_NG_2017 repository. You might consider performing a "git pull" from this directory to update the code to the latest version available on GitHub.
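The update step could look like the sketch below, run on the instance after logging in. It assumes the clone in /data/analysis_code still tracks our GitHub repository as its remote. The existence check is only an illustrative safeguard, so the same lines do nothing harmful on a machine without that directory.

```shell
# Location of the analysis code on the AMI (override REPO_DIR to use another clone).
REPO_DIR="${REPO_DIR:-/data/analysis_code}"

if [ -d "$REPO_DIR/.git" ]; then
    # Fetch and merge the latest commits from the configured remote.
    git -C "$REPO_DIR" pull
else
    echo "No Git repository found at $REPO_DIR"
fi
```

If you have edited files in this directory, `git pull` may report conflicts. Committing or stashing your changes first avoids that.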
Raw and processed data files can be found in various directories inside /data.
Please feel free to contact us if you have any questions or comments about the AMI or analysis code.