Analyzing Sex-Biased Gene Expression in Autoimmune Diseases

Table: MED1224
Experimentation location: School, Home
Regulated Research (Form 1c): No
Project continuation (Form 7): No


In this project, I analyzed sex-biased gene expression in autoimmune diseases using a dataset containing information about donors' immune cells. Autoimmune diseases happen when the immune system attacks the body's own healthy cells. No exact cause has been pinpointed, but suspected risk factors include sex, race, genetics, and environmental factors. Autoimmune diseases are more prevalent in women than in men: women make up 75% of all cases, while men make up only 25%. Scientists have suggested that this disparity could be due to hormonal factors. Women's hormone levels fluctuate over time, and these fluctuations have been connected to autoimmune diseases. One study evaluated the effects of changing levels of prolactin, a hormone that contributes to milk production in mammals. The study found that mice given a prolactin inhibitor lived longer and produced fewer of the antibodies associated with systemic lupus erythematosus (SLE), an autoimmune disease. In contrast, mice with glands that produced more prolactin had accelerated mortality and protein in their urine, which is a key symptom of SLE.

For this project, I used R, a programming language for statistical analysis, in RStudio, which allowed me to analyze vast amounts of data. The database I used, DICE, contains information about each donor's sex, race, and ethnicity, along with the counts for various immune cell types per 1 million transcripts. The data were collected using RNA-Seq, a sequencing technique used to quantify RNA in a sample. In RStudio, my code followed a series of steps. First, I eliminated the data columns that were not needed and then split the dataset into one for female donors and one for male donors. Next, I calculated the mean for each cell type in each of the two datasets. Finally, I measured the difference between the sexes by subtracting the male average from the female average for each cell type and taking the absolute value of that difference.

From this analysis, I found that NK cells and naive CD4+ T cells have the largest differences. Both cell types have been found to be abnormal in count or quality in people with autoimmune diseases. In the future, I plan to narrow the analysis down to specific genes that contribute to the sex disparity in autoimmune diseases.
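
The analysis steps described above can be sketched roughly as follows. This is a minimal illustration written in Python, not the R code used in the project. The column names, the sex labels, and all of the numbers are made up for the example and do not come from DICE.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data: the column names and values are invented for
# illustration only and are NOT taken from the DICE database.
SAMPLE_CSV = """donor_id,sex,race,nk_cells,naive_cd4_t_cells,b_cells
D1,Female,A,10.0,20.0,5.0
D2,Female,B,14.0,24.0,5.0
D3,Male,A,6.0,30.0,5.0
D4,Male,B,8.0,32.0,7.0
"""

# Assumed names of the cell-type columns to compare.
CELL_TYPE_COLUMNS = ["nk_cells", "naive_cd4_t_cells", "b_cells"]


def sex_differences(rows, cell_types):
    """Return the absolute female-minus-male difference in means per cell type."""
    # Step 1: drop the columns that are not needed (keep sex and cell types).
    trimmed = [
        {"sex": row["sex"], **{c: float(row[c]) for c in cell_types}}
        for row in rows
    ]

    # Step 2: split the data into a female dataset and a male dataset.
    females = [row for row in trimmed if row["sex"] == "Female"]
    males = [row for row in trimmed if row["sex"] == "Male"]

    # Steps 3 and 4: average each cell type within each dataset, then take
    # the absolute value of (female average - male average).
    differences = {}
    for cell_type in cell_types:
        female_mean = mean(row[cell_type] for row in females)
        male_mean = mean(row[cell_type] for row in males)
        differences[cell_type] = abs(female_mean - male_mean)
    return differences


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
differences = sex_differences(rows, CELL_TYPE_COLUMNS)

# Rank the cell types from the largest difference to the smallest.
ranked = sorted(differences.items(), key=lambda item: item[1], reverse=True)
for cell_type, difference in ranked:
    print(f"{cell_type}: {difference:.2f}")
```

Because the absolute value is used, the ranking shows how large each difference is but not whether the female or the male average is higher. Keeping the signed difference as well would preserve that direction.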


No additional citations

Additional Project Information


Research Plan:

First, I will find a database that includes information about gene expression and counts of immune cells. Next, I will analyze the data to find the average for each cell type in females and in males. Then I will find the difference between the female and male averages and analyze those differences.

Questions and Answers

1. What was the major objective of your project and what was your plan to achieve it? 

The major objective of my project was to analyze gene expression in females and males to investigate what leads to autoimmune diseases.

       a. Was that goal the result of any specific situation, experience, or problem you encountered?  


       b. Were you trying to solve a problem, answer a question, or test a hypothesis?

I was trying to answer a question. 


2. What were the major tasks you had to perform in order to complete your project?

I had to use RStudio to analyze the large amount of data. 

       a. For teams, describe what each member worked on.


3. What is new or novel about your project?

I used data science to analyze gene expression data instead of experimental lab procedures.

       a. Is there some aspect of your project's objective, or how you achieved it that you haven't done before?

I have never done data science or analyzed large amounts of data before.

       b. Is your project's objective, or the way you implemented it, different from anything you have seen?


       c. If you believe your work to be unique in some way, what research have you done to confirm that it is?

In my background research, the findings I came across on the sex bias in autoimmune diseases were all established through lab procedures.


4. What was the most challenging part of completing your project?

The most challenging part of completing my project was learning to use RStudio as a beginner. 

      a. What problems did you encounter, and how did you overcome them?

I did not know how to use RStudio, and I overcame this problem by learning more about it through lectures and classes.

      b. What did you learn from overcoming these problems?

I learned that resources are always available to us; we just have to use them.


5. If you were going to do this project again, are there any things you would do differently the next time?

I would start with a list of specific genes so that I could test a smaller subset of the data.


6. Did working on this project give you any ideas for other projects? 



7. How did COVID-19 affect the completion of your project?

COVID-19 has not affected the completion of my project.