Data Analysis: peakfinder

ChIPSeq Peak Finder

Program download address

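Overall, the program is a loose collection of Python scripts, which makes it feel awkward to use, so I am now starting to test it.

After unpacking Peak finder I counted 17 files in total, with nothing merged together, so for several days I could not get it to run.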

I. Basic reading of the program documentation

1.

You will want to first convert Solexa output for the chip

and the control sample into bed files using one of the

following scripts:

Convert the Solexa output for the ChIP sample and the control sample into BED files using one of the scripts listed.
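
As a rough illustration of what such a conversion produces, the sketch below turns already-parsed read alignments into BED lines. It is not one of the Peak Finder scripts: the `Read` tuple, the `read_to_bed` function and the 1-based input coordinate are assumptions made for this example, and parsing of the actual Solexa/ELAND output is left out.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """A hypothetical, already-parsed alignment (not the ELAND file format)."""
    chrom: str
    pos: int      # 1-based leftmost position of the alignment (assumed)
    length: int   # read length in bases
    strand: str   # "+" or "-"


def read_to_bed(read: Read, name: str = "read", score: int = 0) -> str:
    """Format one read as a six-column BED line.

    BED uses a 0-based start and an exclusive end, so a 1-based
    position p with length n becomes start = p - 1, end = p - 1 + n.
    """
    start = read.pos - 1
    end = start + read.length
    fields = [read.chrom, str(start), str(end), name, str(score), read.strand]
    return "\t".join(fields)


if __name__ == "__main__":
    for r in [Read("chr1", 1001, 25, "+"), Read("chr2", 500, 25, "-")]:
        print(read_to_bed(r))
```

The same function would be applied to the ChIP reads and to the control reads, giving one BED file per sample.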

2.

The following scripts are used to read the output from the

0.3 version of ELAND run with the --multi option:

The following scripts read the output of ELAND version 0.3 run with the --multi option.

3.

You can also create a bed-formatted WIG file, for display

The following scripts are used to read the output from the

0.3 version of ELAND run with the --multi option:

You can also create a BED-formatted WIG file; the scripts above read the output of ELAND version 0.3 run with the --multi option.

4.

You will want to first convert Solexa output for the chip

and the control sample into bed files using one of the

following scripts:

Conversion of the Solexa output for the ChIP sample and the control sample into BED files.

5.

on the UCSC browser:

UCSC browser: what does this script do?
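
For orientation, a browser track of this kind is a plain text file that the UCSC browser draws as a signal graph. The sketch below assumes per-base coverage has already been computed and writes it as a bedGraph track, which is a related UCSC format and not necessarily what the Peak Finder script emits; `coverage_to_bedgraph` and the track name are made up for this example.

```python
from typing import Iterator, List


def coverage_to_bedgraph(chrom: str, coverage: List[int], name: str = "chip") -> Iterator[str]:
    """Yield a bedGraph track: a header line, then one line per run of
    equal, non-zero coverage (0-based start, exclusive end)."""
    yield f'track type=bedGraph name="{name}"'
    run_start = 0
    for i in range(1, len(coverage) + 1):
        # Close the current run at the end of the data or when the value changes.
        if i == len(coverage) or coverage[i] != coverage[run_start]:
            if coverage[run_start] != 0:
                yield f"{chrom}\t{run_start}\t{i}\t{coverage[run_start]}"
            run_start = i


if __name__ == "__main__":
    for line in coverage_to_bedgraph("chr1", [0, 0, 2, 2, 3, 0]):
        print(line)
```

Merging equal neighbouring values into one line keeps the file small, which matters when the track covers a whole genome.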

6.

The main script actually implements the peak finder:

The main script that actually runs the peak finder.

7.

You will want to first convert Solexa output for the chip

and the control sample into bed files using one of the

following scripts:

on the UCSC browser:

File conversion of the samples into BED format; the author recommends using the first script.

8.

NEW FEATURE of : as of version 2.0, you can

/ should use the -normalize option to calculate

everything as Reads Per Million (RPM). While we have

kept the original behavior as default, we will switch

-normalize to be the default in the next release.

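New feature of the script: as of version 2.0 you can (and should) use the -normalize option to calculate everything as RPM (Reads Per Million). The original behavior is still the default for now; the next release will make -normalize the default.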
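
The arithmetic behind Reads Per Million is simple enough to write down. This is a minimal sketch of the general RPM definition, not the code behind the -normalize option; the function name and the read counts are invented for the example.

```python
def reads_per_million(count: float, total_reads: int) -> float:
    """Scale a raw read count to Reads Per Million (RPM)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return count * 1_000_000 / total_reads


if __name__ == "__main__":
    # Same raw count in a region, different sequencing depth (made-up numbers).
    chip = reads_per_million(120, 4_000_000)     # 30.0
    control = reads_per_million(120, 8_000_000)  # 15.0
    print(chip, control)
```

Scaling by the total number of reads is what makes a ChIP sample and a control sample of different depth comparable.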

The philosophy of this peak finder is to define regions,

and then search for the motif. However, the findall

script can report the actual peaks in the region with

the -listpeak option.

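The philosophy of this peak finder is to define regions and then search them for the motif. Even so, the findall script can report the actual peaks within each region when given the -listpeak option.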
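
The difference between a region and a peak can be shown on a toy coverage array. The sketch below is only an illustration of the idea, under the assumption that a region is a run of consecutive positions at or above a threshold and that its peak is the highest position inside it; the findall script's real criteria are not reproduced here, and `find_regions` is a name invented for this example.

```python
from typing import List, Tuple


def find_regions(coverage: List[float], threshold: float) -> List[Tuple[int, int, int, float]]:
    """Return (start, end, peak_pos, peak_height) for each run of
    consecutive positions whose coverage is >= threshold.

    Coordinates are 0-based indices into `coverage`; `end` is exclusive.
    """
    regions = []
    start = None
    for i, value in enumerate(coverage):
        if value >= threshold:
            if start is None:
                start = i          # a new region opens here
        elif start is not None:
            regions.append(_summarize(coverage, start, i))
            start = None
    if start is not None:          # region running to the end of the array
        regions.append(_summarize(coverage, start, len(coverage)))
    return regions


def _summarize(coverage: List[float], start: int, end: int) -> Tuple[int, int, int, float]:
    # Peak = first position of maximum coverage inside the region.
    peak_pos = max(range(start, end), key=lambda i: coverage[i])
    return (start, end, peak_pos, coverage[peak_pos])


if __name__ == "__main__":
    print(find_regions([0, 1, 5, 9, 6, 1, 0, 4, 4, 0], threshold=4))
    # [(2, 5, 3, 9), (7, 9, 7, 4)]
```

The region boundaries are what a motif search would then be run on, while the peak position is the extra detail a per-peak listing adds.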

9.

The rest of the analysis depends heavily on Cistematic

to run. The following scripts find associated genes and

analyze their GO ontology enrichment, if any:

The rest of the analysis relies on Cistematic: finding the associated genes and analyzing their GO enrichment.
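
To make "GO enrichment" concrete, one common way to score it is a hypergeometric tail probability. The sketch below is that generic test only; whether Cistematic uses this statistic is an assumption, and the function name and the numbers are invented for the example.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) when drawing `sample` genes without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 10 genes in total, 4 carry the term, 3 genes lie near peaks, 2 of those carry the term.
    print(hypergeom_enrichment_p(10, 4, 3, 2))
```

A small probability means the genes near the peaks carry the term more often than a random draw of the same size would.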

10.

The following scripts, also requiring Cistematic,

retrieve the sequence in the enriched regions, find motifs using

Meme and map motif sites in regions around the peaks:

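The following scripts, which also require Cistematic, retrieve the sequences of the enriched regions, find motifs with MEME, and map the motif sites in the regions around the peaks.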

11.

The output of and input of

are motifs in the Cistematic format. A modified

version of to output NRSEs that uses

multiple motifs is:

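The output of one script and the input of the next are both motifs in Cistematic format. There is also a modified version that outputs NRSEs using multiple motifs.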

12.

The remaining scripts are just helper scripts to allow

comparison between runs and/or move data into UCSC format.

The remaining scripts are helper scripts for comparing runs and for converting data into UCSC format.
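
Comparing two runs usually comes down to asking which regions they share. The sketch below is a minimal version of that check, not one of the helper scripts; the `Region` tuple layout and both function names are assumptions for this example, and the nested loop is quadratic, so it only suits small region lists.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chrom, start, end), end exclusive


def overlaps(a: Region, b: Region) -> bool:
    """True if the two regions are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def shared_regions(run_a: List[Region], run_b: List[Region]) -> List[Region]:
    """Regions of run_a that overlap at least one region of run_b."""
    return [a for a in run_a if any(overlaps(a, b) for b in run_b)]


if __name__ == "__main__":
    run_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    run_b = [("chr1", 150, 250), ("chr2", 300, 400)]
    print(shared_regions(run_a, run_b))
    # [('chr1', 100, 200)]
```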

II. Program test examples