
Bioinformatics Services
Array data must be analyzed with care to best identify differences between samples, but not everyone has the expertise or resources to properly process array data. Let our in-house biostatistics and bioinformatics experts analyze your data for you.
We can perform custom analyses for both genomics and proteomics data. For RayBiotech multiplex arrays, we offer à la carte options. We can also assist in experimental design. Click on a service below to learn more:
Tell us about your project and the type of analysis you need at [email protected]. You can also call us at 770-720-2992 to quickly speak with a representative to help guide you.
Data Clean-Up
We recommend the data clean-up service for all array data as it forms the foundation for all subsequent biostatistics and bioinformatics analysis. Your final report will contain the protein expression data after data filtration, normalization, transformation, and outlier extraction (Microsoft Excel format) in addition to a detailed description of analysis steps performed (PDF format).

Outlier identification using Principal Component Analysis (PCA)
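As a rough illustration only, the filtration, transformation, and normalization steps named above could look like the sketch below. The signal floor (`min_signal`), the log2 transform, and per-sample median centering are assumptions chosen for the example, not a description of the procedure used in the service, and PCA-based outlier identification is left out.

```python
import math
import statistics

def clean_up(table, min_signal=1.0):
    """table: {protein: [signal per sample]}; returns log2, median-centered values."""
    # Filtration: drop proteins whose signal never exceeds the (assumed) floor.
    kept = {p: v for p, v in table.items() if max(v) > min_signal}
    # Transformation: log2, with the floor applied so zero signals do not fail.
    logged = {p: [math.log2(max(x, min_signal)) for x in v] for p, v in kept.items()}
    # Normalization: subtract each sample's median so samples share a common center.
    n_samples = len(next(iter(logged.values())))
    medians = [statistics.median(v[j] for v in logged.values()) for j in range(n_samples)]
    return {p: [x - medians[j] for j, x in enumerate(v)] for p, v in logged.items()}

# Hypothetical raw signals: proteins by row, three samples by column.
raw = {"IL-6": [100, 200, 400], "TNF": [50, 100, 200], "BLANK": [0.5, 0.2, 0.9]}
cleaned = clean_up(raw)
```

In this toy input the "BLANK" row is filtered out, and the two remaining proteins sit half a log2 unit above and below each sample's median.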
Differential Expression Analysis
This analysis identifies proteins whose expression differs significantly between groups of samples. This service is especially helpful for biomarker discovery, validation, and profiling. The types of statistical analyses include:
Your final report will contain the following information:
- Description of analysis steps performed
- Comparison of statistical tests
- Result summary depicting p-values
- Volcano plot
- Jitter or swarm plot

Jitter plot comparing sample groups for one biomarker candidate.
Cross-validation accuracy and kappa of biomarkers using different statistical tests.
Volcano plot comparing sample groups.
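To make the p-values and the volcano plot more concrete, here is a minimal sketch that computes one volcano-plot point (log2 fold change and -log10 p) for a single protein. It uses an exact permutation test on the difference in group means, which is a stand-in chosen because it needs only the standard library; the statistical tests used in the actual service are not listed here, and the function names and data are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    n_a, n_b, total = len(group_a), len(group_b), sum(pooled)
    hits = count = 0
    # Enumerate every way of relabeling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        hits += diff >= observed - 1e-12  # small tolerance for float rounding
        count += 1
    return hits / count

def volcano_point(group_a, group_b):
    """Return (log2 fold change, -log10 p) for values already on a log2 scale."""
    p = permutation_p_value(group_a, group_b)
    return mean(group_b) - mean(group_a), -math.log10(p)

# Hypothetical log2 expression values for one protein.
control = [1.0, 1.2, 0.9, 1.1]
treated = [2.0, 2.3, 2.1, 1.9]
p_value = permutation_p_value(control, treated)
log2_fc, neg_log10_p = volcano_point(control, treated)
```

With four samples per group there are 70 possible relabelings, so the smallest p-value this test can return is 2/70. With three per group it is 2/20 = 0.1, which shows why very small groups limit what a test can detect.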
Cluster Analysis
Cluster analysis identifies groups of markers with similar and different expression profiles across groups. This service is especially helpful for profiling and sample stratification. The final report includes:
- Description of analysis steps performed
- Hierarchical cluster
- PCA plot

Hierarchical cluster and heatmap of 8 samples where red represents increased expression level and blue represents decreased expression level.
Plot of PC1 and PC2 values of 6 samples. Samples subjected to Treatment #2 ("T2") cluster with each other but not with samples subjected to Treatment #1 ("T1"). Samples subjected to T1 do not cluster with each other.
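For readers who want a feel for how hierarchical clustering groups samples, the sketch below repeatedly merges the two closest clusters until a requested number remains. Euclidean distance and average linkage are assumptions made for the example; the distance measure and linkage used in the service are not specified here, and the sample names and values are made up.

```python
import math

def hierarchical_clusters(profiles, n_clusters):
    """Average-linkage agglomerative clustering; profiles: {sample: [values]}."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Average Euclidean distance over all cross-cluster sample pairs.
        dists = [math.dist(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical two-marker expression profiles for four samples.
profiles = {
    "T1_a": [0.0, 0.0],
    "T1_b": [0.2, 0.1],
    "T2_a": [5.0, 5.0],
    "T2_b": [5.1, 4.9],
}
groups = hierarchical_clusters(profiles, 2)
```

Here the two T1 samples end up in one cluster and the two T2 samples in the other, because each pair is far closer to itself than to the other treatment.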
Pathway Analysis
Pathway analysis identifies the specific protein functions, biological pathways, and physical interactions that are enriched in a particular group. The data are obtained from:
- GO (Gene Ontology)
- KEGG (Kyoto Encyclopedia of Genes and Genomes)
- STRING
This service is used for profiling, sample stratification, and biomarker discovery. Your final report will contain the following information:
- Description of analysis steps performed
- Pathway enrichment
  - List of enriched pathways and FDR values
  - List of the proteins, their known functions and processes, and p-values related to their enrichment
  - Enriched biological pathways (see figure)
  - Enriched molecular functions
  - Enriched biological processes
- Protein interaction mapping
  - List of proteins identified in your study that have known interactions with each other
  - Protein interaction map (see figure)

KEGG pathway over-representation analysis on differentially-expressed biomarkers (FDR < 0.05).
Protein interaction map where proteins are represented as nodes and interactions as edges.
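Over-representation analysis and FDR values can be illustrated with a short sketch. It assumes the common textbook approach: a hypergeometric upper-tail test for each pathway, followed by a Benjamini-Hochberg adjustment across pathways. Whether the service uses exactly these methods is not stated here, and all counts below are invented.

```python
from math import comb

def enrichment_p_value(hits, selected, pathway_size, background):
    """P(X >= hits) when `selected` proteins are drawn from `background`,
    of which `pathway_size` belong to the pathway (hypergeometric upper tail)."""
    upper = min(selected, pathway_size)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(background, selected)

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical case: 3 differentially expressed proteins, all in a 4-protein
# pathway, on an array measuring 10 proteins in total.
p_single = enrichment_p_value(hits=3, selected=3, pathway_size=4, background=10)
fdr = benjamini_hochberg([0.01, 0.04, 0.03])
```

Using the proteins measured on the array as the background, rather than the whole proteome, is a design choice that matters for targeted arrays: it avoids calling a pathway enriched just because the array was built around it.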
Biomarker Selection
Biomarker selection uses a variety of models to identify the subset of biomarkers that best differentiates control samples from test samples. The models that are used in this service include:
- Logistic regression
- Linear discriminant analysis (LDA)
- Support vector machine (SVM)
- Random forest
- Other models may be used
Your final report will contain the following information:
- Description of analysis steps performed
- Predictive modeling using a subset of data
- ROC analysis
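ROC analysis summarizes how well a model's scores separate two groups. As a minimal sketch, the area under the ROC curve (AUC) can be computed as the fraction of test/control pairs in which the test sample gets the higher score. The scores and labels below are hypothetical, and labels are assumed to be 1 for test samples and 0 for controls.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative
    (ties count one half); this equals the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores for three test samples and three controls.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

An AUC of 0.5 means the scores carry no information about group membership, and 1.0 means the groups are perfectly separated. In this example 8 of the 9 test/control pairs are ranked correctly.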
Experimental Design
Consult us before you begin an experiment. Don’t find out after you’ve collected your data that you didn’t design your experiment properly! RayBiotech’s team of technical experts will work with you to design your experiment so that you can obtain statistically powerful array data. We consider such things as:
- Replication
- Randomization
- Blocks
- Sample size
- Estimated biomarker prevalence
- Sensitivity & specificity
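Sample size is the easiest of these considerations to show with numbers. The sketch below uses the standard normal-approximation formula for comparing two group means, n = 2 * ((z_alpha + z_beta) * sd / effect)^2 per group. The defaults (two-sided alpha of 0.05, 80% power) are conventional assumptions, not values taken from this page, and a real design would also account for multiple testing across many targets.

```python
import math
from statistics import NormalDist

def samples_per_group(effect, sd, alpha=0.05, power=0.80):
    """Normal-approximation sample size for comparing two group means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

# Hypothetical effect sizes on a log2 scale with a standard deviation of 1.
n_large_effect = samples_per_group(effect=1.0, sd=1.0)
n_medium_effect = samples_per_group(effect=0.5, sd=1.0)
```

Halving the expected difference roughly quadruples the number of samples needed, which is why an estimate of the effect size is worth discussing before any samples are run.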
What you get:
- One-on-one consultation with a technical expert
- Detailed experimental design report
- Properly designed experiment so that statistically powerful data can be obtained
Frequently Asked Questions
Still have questions?
Package Name | What you get | Per sample with ≤ 500 targets | Per sample with > 500 targets** | Catalog # |
Basic | Data clean-up | $5 | - | BIOSTAT-Pre-A |
Basic | Data clean-up | - | $8 | BIOSTAT-Pre-B |
Any 1* | Any ONE service* (includes data clean-up) | $10 | - | BIOSTAT-1-A |
Any 1* | Any ONE service* (includes data clean-up) | - | $16 | BIOSTAT-1-B |
Any 2* | Any TWO services* (includes data clean-up) | $16 | - | BIOSTAT-2-A |
Any 2* | Any TWO services* (includes data clean-up) | - | $25.60 | BIOSTAT-2-B |
Any 3* | Any THREE services* (includes data clean-up) | $21 | - | BIOSTAT-3-A |
Any 3* | Any THREE services* (includes data clean-up) | - | $33.60 | BIOSTAT-3-B |
The Works | All 4 services* (includes data clean-up) | $24 | - | BIOSTAT-Works-A |
The Works | All 4 services* (includes data clean-up) | - | $38.40 | BIOSTAT-Works-B |
Experimental Design | Experimental Design | $250 per experiment*** | $250 per experiment*** | BIOSTAT-ED |
These prices are for array data using RayBiotech arrays. Minimum of $250 per order.
For pricing information on data collected using other companies' arrays, bulk service orders, custom packages, or custom analyses, please contact us.
* All data must go through pre-processing, which is known as our "Data clean-up" service. This package does NOT include Experimental Design.
** Up to 1500 targets. For more than 1500 targets per array, please contact us for pricing.
*** For example, biomarker discovery and validation are two different experiments.
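As a quick way to estimate an order, the per-sample prices and the $250 minimum can be combined as in the sketch below. The prices are copied from the pricing table above; applying the minimum to a single package's total is an assumption, and Experimental Design, custom work, and arrays with more than 1500 targets are left out because they are priced separately.

```python
from decimal import Decimal

# Per-sample prices from the pricing table: (<= 500 targets, 501 to 1500 targets).
PRICES = {
    "Basic": (Decimal("5"), Decimal("8")),
    "Any 1": (Decimal("10"), Decimal("16")),
    "Any 2": (Decimal("16"), Decimal("25.60")),
    "Any 3": (Decimal("21"), Decimal("33.60")),
    "The Works": (Decimal("24"), Decimal("38.40")),
}
MINIMUM_ORDER = Decimal("250")

def estimate_cost(package, n_samples, n_targets):
    """Estimated order cost in USD for RayBiotech array data (illustrative only)."""
    if n_targets > 1500:
        raise ValueError("more than 1500 targets: pricing is by quote")
    low, high = PRICES[package]
    per_sample = low if n_targets <= 500 else high
    # Assumption: the $250 minimum applies to the total for this package.
    return max(per_sample * n_samples, MINIMUM_ORDER)

cost_any2 = estimate_cost("Any 2", n_samples=40, n_targets=200)       # 40 x $16
cost_basic = estimate_cost("Basic", n_samples=20, n_targets=200)      # below minimum
cost_works = estimate_cost("The Works", n_samples=10, n_targets=1000) # 10 x $38.40
```

In the second example, 20 samples at $5 come to $100, so the $250 minimum applies instead.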
This depends on the data analysis that we will perform.
If data from RayBiotech arrays will be analyzed with our Biostatistics & Bioinformatics service, the data must be provided in the correct format:
- Analyzed with the Analysis Tool software. The Analysis Tool software is available for free to download on the array's product page. If you have difficulty locating the software, please e-mail [email protected] and provide the array catalog numbers that you used.
- Normalized inter-array data. The Analysis Tool software normalizes data within a slide, but does not perform across-slide normalization. Learn more about normalizing array data on this page under "How do I normalize my array data?"
RayBiotech can provide the correct data format for RayBiotech arrays as part of our array full testing services or array analysis services (additional fees apply).
If other types of data will be analyzed, we will let you know what type of format we will require.
Click on a service below to view an example report:
This step is important for a couple of reasons.
The first reason is that this service puts the data in a format that is required by the other services. In other words, if a dataset does NOT go through "Data clean-up", none of the other services can be performed.
The second reason is that variability within and across arrays is common with protein arrays, so data normalization and transformation improve technical reproducibility. Outlier data can also skew results. If this service is NOT used, it is highly probable that real differences between sample groups will go undetected. Alternatively, differences that are detected may not be reliable (i.e., false positives).
We can identify biomarkers by their differential expression; enrichment of pathway, molecular function, biological process, and protein-protein interactions; and feature selection following predictive modeling.
It depends. We appreciate feedback and suggestions, and may implement your suggestions as part of our standard services. Some figures and analyses may be free depending on the selected package, number of samples and workload that it will take. Please contact us to find out.
For packages using only one service, our turnaround time is 5 - 7 business days. For packages that include two or more services, our turnaround time is 10 - 12 business days.
Once we receive all of the information that we need from you, our turnaround time is 5 - 7 business days.
We can. The format needs to be in .csv or Excel. The protein symbol or gene symbol must be listed by row. The sample must be listed by column. For these analyses, you will need to contact us for pricing.
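To check a file against this layout before sending it, something like the following sketch may help. It assumes a .csv file whose first row holds the sample names and whose first column holds the protein or gene symbol; that header arrangement and the function name are assumptions, since only "symbols by row, samples by column" is specified.

```python
import csv
import io

def load_expression_csv(text):
    """Parse CSV text with one protein or gene per row and one sample per column."""
    rows = [r for r in csv.reader(io.StringIO(text)) if r]  # skip blank lines
    header, body = rows[0], rows[1:]
    samples = header[1:]
    data = {}
    for row in body:
        if len(row) != len(header):
            raise ValueError(
                f"row {row[0]!r} has {len(row) - 1} values, expected {len(samples)}"
            )
        data[row[0]] = [float(x) for x in row[1:]]
    return samples, data

# Hypothetical file contents: two proteins measured in two samples.
example = "Symbol,S1,S2\nIL6,1.5,2.0\nTNF,0.3,0.4\n"
samples, data = load_expression_csv(example)
```

A row with a missing or extra value raises an error that names the offending symbol, which is usually the fastest way to find a misaligned export.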
We can do analyses on DNA data and the cost is dependent on several factors. We will need to know sample number, data quality, data size, and data type (i.e., how was it analyzed?). You also need to let us know exactly what type of information you’re looking for, such as peak identity, peak annotation, sample comparison, etc. Our turnaround time is usually 2 - 3 weeks. Please contact us to learn more.
We often do, depending on the number of samples that will be analyzed. Please contact us to find out.
There is a required amount of time to set up the analyses per experiment, and it is one of the most labor-intensive steps of the process. Therefore, we need to charge a minimum amount to cover the labor required for setting up the analyses.
Yes, you can. If an Excel format of your report is not provided to you, just ask us for it.
The "Differential expression analysis", "Cluster analysis", and "Pathway analysis" services look for the statistical or biological relationship of a biomarker with the disease. The "Biomarker selection" service looks for the combination of biomarkers that will classify or predict disease status.
"Data clean-up": 1 sample
"Pathway analysis": 1 sample per group
"Cluster analysis": 1 sample per group
"Differential expression analysis": 3 samples per group
"Biomarker selection": 10 samples per group
"Per group" refers to each condition to be tested. If healthy versus cancer samples are to be compared, these are two different groups and a minimum of 20 samples would be needed to perform the "Biomarker selection" service.
Low sample numbers for "Differential expression analysis" and "Biomarker selection" services will result in less accurate modeling. Researchers should try to use at least 10 samples per group for the "Differential expression analysis" service and at least 25 samples per group for the "Biomarker selection" service.
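The minimum and recommended group sizes above can be checked with a small lookup, sketched here. The numbers come from the list above; the function name and the wording of the returned messages are hypothetical. "Data clean-up" is omitted because its minimum is one sample overall rather than per group.

```python
# Minimum samples per group, as listed above.
MINIMUM = {
    "Pathway analysis": 1,
    "Cluster analysis": 1,
    "Differential expression analysis": 3,
    "Biomarker selection": 10,
}
# Recommended samples per group, where the text gives a recommendation.
RECOMMENDED = {
    "Differential expression analysis": 10,
    "Biomarker selection": 25,
}

def check_group_sizes(service, group_sizes):
    """group_sizes: {group name: number of samples}; returns a short status string."""
    smallest = min(group_sizes.values())
    if smallest < MINIMUM[service]:
        return "below minimum"
    if smallest < RECOMMENDED.get(service, 0):
        return "meets minimum, below recommended"
    return "ok"

status = check_group_sizes("Biomarker selection", {"healthy": 10, "cancer": 10})
```

For the healthy versus cancer example, 10 samples in each group meets the minimum of 20 in total but falls short of the recommended 25 per group.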
For a general, non-math understanding of how many of these analyses work, please click here.