About cohorts

The SISu project includes aggregate genotype data from population-based epidemiological collections and from disease collections. Access to sequence and phenotype data is managed by The National Institute for Health and Welfare's (THL) biobank or by the custodians of the individual cohorts, according to their data access protocols and regulations. The numbers of samples from the different cohorts and their cohort-specific custodians are listed in the table below. On this page you can also find detailed information about the different cohorts as well as links to their home pages.

Cohort                       Samples  Custodian
1000 Genomes                 41       Richard Durbin
Adgen                        391      Mikko Hiltunen
Botnia                       168      THL Biobank
Eufam                        36       Marja-Riitta Taskinen
FINRISK and Dilgom           6069     THL Biobank
Fusion                       831      Michael Boehnke
Health 2000                  227      THL Biobank
IBD                          579      Martti Färkkilä
Metsim                       823      Markku Laakso
Migraine                     348      Aarno Palotie
Oulu Dyslipidemia Families   22       Sanger Institute
Schizophrenia controls       697      PMID: 24463508
Twins                        258      Jaakko Kaprio

1000 Genomes

The 1000 Genomes Project is an international collaboration to produce an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts. This resource will support genome-wide association studies and other medical research studies. The genomes of about 2500 unidentified people from about 25 populations around the world will be sequenced using next-generation sequencing technologies. The results of the study will be freely and publicly accessible to researchers worldwide. Among the individuals being sequenced there are also DNA samples from 100 Finnish volunteers.

Read more from their official website at www.1000genomes.org/home.


Adgen

The ADGEN cohort has been collected for a study focusing on the identification of novel Alzheimer’s disease (AD)-associated genes and pathways using existing clinical cohorts from Eastern and Northern Finland. The cohort consists of ~1700 patients fulfilling the NINCDS-ADRDA criteria for probable Alzheimer’s disease and ~1700 cognitively healthy controls from Eastern and Northern Finland. Approximately 700 Alzheimer’s disease patients from the ADGEN study have been included in a whole-exome sequencing project, and 391 of these are included in the current SISu release.


Botnia

The Botnia cohort has been collected from the western coast of Finland, on the Gulf of Bothnia, for four different studies of type 2 diabetes. The Botnia Study, started in 1990, is one of the largest diabetes family studies in the world. The initial family-based Botnia study comprised 11000 individuals as well as a prospective 10-year follow-up of 2800 individuals. The Botnia study also includes a population-based study of 5200 individuals aged 18-75 with an ongoing 6-year follow-up study. A project aiming to cover all diabetic patients in the region has also been launched and currently includes more than 4000 individuals. The study includes individuals from about 4000 families (about 1000 independent trios), and extensive phenotype information is available for all study participants. The current SISu release includes 168 samples from this study.

Read more from their official website at http://www.botnia-study.org/en/home/.


Dilgom

The DILGOM follow-up study, started in 2014, is a subsample of The National FINRISK 2007 cohort. The name DILGOM comes from DIetary, Lifestyle, and Genetic determinants of Obesity and Metabolic syndrome. It is a population study on how nutrition, diet, lifestyle, psychosocial factors, environment and genetics are linked to obesity and the metabolic syndrome. The donor samples were collected for the THL Biobank at the research sites of Helsinki, Vantaa, Turku and the Loimaa area.

Read more from their official website at www.thl.fi/en/web/thlfi-en/topics/information-packages/thl-biobank/sample-collections-in-thl-biobank/dilgom-follow-up-2014-survey.


Eufam

The EUFAM (European Study of Familial Dyslipidemias) study is a project aiming to reveal the molecular and genetic basis of familial combined hyperlipidemia (FCHL) and of familial low high-density lipoprotein cholesterol (HDL-C). The study cohort comprises over 1500 family members from 140 Finnish families with premature coronary heart disease and with either FCHL or familial low HDL-C. The EUFAM study aims to characterize the detailed genotypes and phenotypes of these familial dyslipidemias using current high-throughput technology from both genetics and molecular biology. Thirty-six samples from this study are included in the SISu dataset.


FINRISK

The FINRISK cohorts comprise the respondents of representative, cross-sectional population surveys that have been carried out every 5 years since 1972 to assess the risk factors of chronic diseases (e.g. CVD, diabetes, obesity, cancer) and health behavior in the working-age population in 3-5 large study areas of Finland. DNA samples have been collected in the following survey years: 1987, 1992, 1997, 2002, 2007, and 2012. The cohort sizes are 6000-8800 per survey, and SISu includes 6069 of these samples.

Read more from their official website at https://thl.fi/en/web/thlfi-en/research-and-expertwork/population-studies/the-national-finrisk-study


Fusion

The Finland-United States Investigation of NIDDM Genetics (FUSION) dataset was collected for localizing and identifying genetic variants that predispose to type 2 diabetes mellitus (T2D) or are responsible for variability in diabetes-related quantitative traits. The FUSION study sample includes approximately 800 families ascertained for sibling pairs affected with type 2 diabetes, in some cases also including parents, unaffected siblings, spouses and children; ~200 unrelated individuals with normal glucose tolerance at ages 65 and 70 years, in some cases with their spouses and children; and ~8400 mostly unrelated individuals, including ~1700 type 2 diabetics, selected from the D2D 2004, Finrisk 1987, Finrisk 2002, Health 2000, Action LADA, and Savitaipale Diabetes studies. 831 of these samples (not included in Finrisk 1987, Finrisk 2002, or Health 2000) are currently part of SISu.

Read more from their official website at https://fusion.sph.umich.edu/

Health 2000

The Health 2000 Survey, a comprehensive combination of a health interview and a health examination survey, was carried out in 2000-2001. The study was based on a nationally representative sample of 8028 persons aged 30 and over living in mainland Finland. In addition, a sample of 1894 persons aged 18-29 and a sample of 1260 survivors from the Mini-Finland Health Examination Survey were included in the data. The Mini-Finland Health Examination Survey, which was also representative of the Finnish population, was carried out in 1978-1980 by The Social Insurance Institution. The main aim of the Health 2000 Survey was to obtain information on the most important public health problems in the working-age and aged population, their causes and treatment, as well as on the population’s functional capacity and working capacity. For 2130 individuals, selected as a case-control study for metabolic syndrome (GenMets study), genome-wide SNP data is available. SISu contains 227 samples in total from this cohort.

Read more from their official website at https://thl.fi/en/web/thlfi-en/research-and-expertwork/projects-and-programmes/health-2000-2011.


Metsim

The cross-sectional METSIM (METabolic Syndrome In Men) Study includes 10,197 men, aged from 45 to 73 years, randomly selected from the population register of the town of Kuopio, Eastern Finland, and examined in 2005-2010. The aim of the study was to investigate genetic and non-genetic factors associated with the risk of type 2 diabetes (T2D), cardiovascular disease (CVD), and insulin resistance-related traits in a cross-sectional and longitudinal setting. The SISu dataset includes 823 samples from this study.


IBD

The study evaluated the familial occurrence of inflammatory bowel disease (IBD) and the clinical features of familial and sporadic IBD in the genetically homogeneous Finnish population. A total of 257 patients with Crohn disease (CD) and 436 with ulcerative colitis (UC) participated in the study.

Read more in the article at dx.doi.org/10.1080/00365520212511


Migraine

The Migraine Family Study sample consists of migraine patients visiting headache clinics, from whom extensive questionnaire data on headache and co-morbid disorders has been collected. The study combines detailed phenotyping in large samples with high-throughput genotyping and novel tools of statistical genetics. The aim is to identify gene variants and assess their impact in ascertained family and case-control samples, and subsequently to study their population relevance in large international population samples. 348 samples from this study are included in the SISu dataset.

Read more from their official website at https://www.fimm.fi/en/press-release/1466436294.

Oulu Dyslipidemia Families

These samples include exome sequences of family members with dyslipidemias who are of northern Finnish origin. The SISu dataset contains 22 samples from this cohort.

Read more from their official website at www.ega-archive.org/studies/EGAS00001000384


Twins

The Finnish Twin Cohort was first established in 1974 to investigate genetic and environmental risk factors for chronic disorders. Twins and their families have been ascertained in three stages from the Central Population Register in 1974 (older like-sexed pairs), 1987 (multiple births 1968-1987) and 1995 (opposite-sex pairs 1938-1957). There are a total of 12,966 MZ and DZ twin pairs (25,932 individuals) with both members currently alive and excluding individuals who refused to participate in studies. Over 15000 DNA samples have been collected in this study, and serum and other biological samples are available from several sub-studies as well. SISu includes 258 samples from this cohort.

Read more from their official website at https://wiki.helsinki.fi/display/twineng/Twinstudy.


Finnish controls from a Swedish schizophrenia study

The study sought to identify the alleles, genes or gene networks that harbor rare coding variants of moderate or large effect on risk for schizophrenia by exome-sequencing 5,079 individuals, selected from a Swedish sample of more than 11,000 individuals. Controls of Finnish origin are included in the SISu database.

Read more in the article at www.ncbi.nlm.nih.gov/pubmed/?term=24463508