Quality data is the foundation of successful research today. This catalogue is designed as a practical guide for academic staff and students to help them select appropriate data sources and efficiently find the necessary datasets (collections of similar or related research data grouped for research purposes) for their studies.
The catalogue covers both local Latvian data sources and international repositories, providing a structured overview of available options. Each data source is supplemented with specific examples and practical advice to facilitate the data search process.
Open data has become an integral part of modern science, making research more transparent, collaborative, and efficient. This is freely available data that researchers and students can obtain, analyze, and use in their projects without legal or financial restrictions.
Data repositories function as digital libraries where this data is professionally stored, organized, and made easily discoverable. They provide not only secure data storage but also long-term accessibility and quality control. Depending on their specialization, repositories may focus on specific scientific disciplines (e.g., medical data) or offer a universal platform for research materials from various fields.
Scientific Publications and Associated Datasets
Scientific publications are a valuable resource not only for gaining new insights but also for discovering quality datasets. Scientific standards increasingly require researchers to publish the data used in their studies, thereby promoting research reproducibility (the ability to repeat a study using the same data and methods) and transparency.
Prestigious scientific journals (e.g., Nature, Science, PLOS ONE) now often require data availability statements, in which authors indicate where and how the study data can be accessed. This data may be available through:
- Repositories: links to Zenodo, Figshare, Dryad, or discipline-specific repositories.
- Journal supplementary materials: Excel or CSV files as article appendices.
- Authors: contact information for data requests.
- Institutional portals: university or research center data platforms.
Practical tip: When searching for literature in your research field, pay attention to "Methods", "Data Availability", or "Supplementary Materials" sections - they often contain references to valuable datasets that could be used in your research as well.
Google Dataset Search
A specialized search engine that helps find publicly available datasets across the internet. This tool is especially useful as it allows searching for data by topic, format, or source. It searches for datasets in various repositories and institutional websites online. Google Dataset Search is an ideal starting point for any data search.
Practical tip: Start with broad, general terms in English (e.g., "education statistics" or "climate data"), then specify with additional filters. Also use geographic terms - "Latvia," "Europe" - to find regionally specific data.
https://datasetsearch.research.google.com/
Data Repositories
Data repositories are specially designed platforms where datasets are stored and shared. Repositories offer much more than simple file storage - they provide metadata management, version control, citability with DOI numbers, and integration with other scientific tools.
Repositories fall into two categories:
- General repositories, such as Zenodo and Figshare, accept data from any scientific discipline and offer extensive functionality. Zenodo is a European Union-funded repository particularly popular for its GitHub integration and free DOI assignment. Figshare offers both private and public collections and is known for its user-friendly interface.
- Specialized repositories focus on specific scientific disciplines or data types. Dryad specializes in ecology and evolution data, often collaborating with scientific journals to fulfill data publication requirements. Kaggle is oriented toward machine learning and data science, offering not only datasets but also an active community and competitions. GenBank in biology and ICPSR in social sciences have become industry standards with high quality and documentation criteria.
Quality repositories provide data along with structured metadata that includes technical information about data format and structure, methodological information about data collection processes, and legal information about licenses and usage terms. Metadata is essential for researchers to effectively analyze and reproduce studies.
Access options for datasets can vary significantly. Open access allows immediate data download without registration, usually with Creative Commons licenses. Other repositories may require free registration or even author permission, especially for sensitive or personal data. In some cases, quality or exclusive data may be available for a fee.
Tip: To find discipline-specific repositories, use re3data (Registry of Research Data Repositories) - a global registry covering over 3000 repositories from all scientific disciplines, allowing searches by discipline, filtering by access types, and comparing quality standards. The platform helps find the most suitable repository with the best standards and long-term prospects for both data publication and acquisition.
DataverseLV
DataverseLV is a secure, multidisciplinary research data repository in Latvia, supported by Latvian universities - Latvia University of Life Sciences and Technologies, University of Latvia, Riga Stradins University, and Riga Technical University - and the Higher Education and Science IT Shared Services Center. It was established in 2025 and is based on Harvard University's open-source Dataverse software, which allows open management, sharing, and preservation of research data.
DataverseLV is Latvia's first national scientific data repository, creating opportunities for local researchers to publish their data in accordance with international open science standards. The repository ensures long-term data preservation, assigns DOI numbers for citability, and offers a structured metadata system. It is particularly important in the Latvian context as it allows preserving and making available data about local social, economic, and environmental processes that might not otherwise be found in international repositories.
Datasets in the repository can be searched by keywords or categories. Open datasets are freely available; for others, access must be requested from the respective dataset author(s).
National-level data sources form the foundation of research infrastructure, providing access to reliable and current information necessary for academic research, policy development, and public information. Unlike private sources, government institutions offer official, standardized data that are often the only authoritative sources for specific national-scale processes.
These data sources cover a broad spectrum - from demographic and economic indicators to health statistics and environmental monitoring. The main advantages of government data are high quality, methodological consistency, and broad coverage, although micro data access may be restricted and require special applications.
Central Statistical Bureau
The Central Statistical Bureau (CSB) is a direct administration institution supervised by the Ministry of Economics of the Republic of Latvia and is the main producer and coordinator of official statistics in the country. CSB is responsible for organizing the production of official statistics in Latvia and for the accuracy of the data it compiles from information provided by respondents.
CSB's overarching goal is to provide up-to-date statistical information while, during its strategy implementation period, developing partnerships for the use of new data sources and methods in statistics production. CSB compiles, processes, and publishes statistics that are essential for state administration, businesses, researchers, and society.
Dataset examples:
- Macro data: demographics (population by regions, births, deaths), economics (GDP by sectors, inflation, wages), labor market (unemployment rate, employment by sectors), social indicators (poverty risk, quality of life indices).
- Micro data: household budget survey individual records, labor force survey detailed data, enterprise structure survey data.
Macro data on the CSB portal are free, and no registration is required to access them. Data can be searched by keywords and categories.
Micro data may only be used for research purposes, and the research results must benefit society. Access can be requested by sending an application to research@csp.gov.lv. The application must include a completed form specifying the research project description, a justification of why indirectly identifiable data are necessary for the research, and how the protection of confidential data will be ensured.
Disease Prevention and Control Centre
The Disease Prevention and Control Centre is a direct administration institution subordinate to the Minister of Health of the Republic of Latvia. Its objectives are to implement national public health policy in the areas of epidemiological safety and disease prevention, to implement health care policy in the area of health care quality, and to ensure the implementation and coordination of health promotion policy.
The Disease Prevention and Control Centre's health statistics database includes various health and healthcare indicators: population health, maternal and child health, healthcare, mortality, habits affecting population health, healthcare outcomes, and patient safety.
Macro data on the Disease Prevention and Control Centre portal are free, and no registration is required to access them. Data can be searched in the database by categories and keywords.
https://statistika.spkc.gov.lv/pxweb/en/Health/
To access micro data, an application must be completed and submitted to the Disease Prevention and Control Centre (SPKC). The application process description is available here (in Latvian).
Latvia Open Data Portal
Latvia's open data portal is an online platform where data held by state and local government institutions are published and made accessible in open formats. The portal was created to promote transparency, public participation, and innovation by giving everyone access to datasets from various fields, for example the environment, economy, public transport, health, and education.
The portal operates according to open data principles: data are freely available, can be downloaded, processed, and used for research, business, and public needs. Most data are structured and available in standardized formats (e.g., CSV, JSON), making them easy to use in digital applications and analytical solutions.
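To illustrate why standardized formats are easy to work with, here is a minimal Python sketch that reads the same records from CSV text and from JSON text using only the standard library. The column names and values are hypothetical placeholders, not taken from any real dataset on the portal.

```python
import csv
import io
import json

# Hypothetical sample content: the same two records as they might
# appear in a CSV download and in a JSON download (placeholder values).
CSV_TEXT = "year,municipality,population\n2022,Municipality A,1000\n2022,Municipality B,2000\n"
JSON_TEXT = (
    '[{"year": 2022, "municipality": "Municipality A", "population": 1000},'
    ' {"year": 2022, "municipality": "Municipality B", "population": 2000}]'
)


def read_csv_records(text: str) -> list[dict]:
    """Parse CSV text into dicts, converting the numeric columns to int."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append(
            {
                "year": int(row["year"]),
                "municipality": row["municipality"],
                "population": int(row["population"]),
            }
        )
    return records


def read_json_records(text: str) -> list[dict]:
    """Parse JSON text; JSON already carries numbers as numbers."""
    return json.loads(text)


csv_records = read_csv_records(CSV_TEXT)
json_records = read_json_records(JSON_TEXT)

# Both formats yield the same in-memory records.
print(csv_records == json_records)
```

Note that CSV stores every value as text, so numeric columns need explicit conversion, while JSON preserves the number type.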
Latvia's open data portal is part of European Union open data initiatives and is connected to the European data portal, promoting international data accessibility and cross-border cooperation.
The portal contains data on various categories: foreign affairs, economy, energy, population and society, education and sports, culture, regions and municipalities, justice/interior and security, transport, public administration, health, environment, agriculture, science, and technology.
Data access is free, and no registration is required. Data can be searched by keywords and categories.
International data sources are fundamental in modern research as they provide access to standardized data that allow reliable comparisons between countries and regions, analysis of global trends, and assessment of relative positions in international context. Unlike national sources, international organizations offer standardized data using unified methodologies and definitions.
International data sources also offer long series that allow analysis of development trends over decades, as well as forecasts and analytical materials that supplement statistical data with expert assessments and interpretations.
Eurostat
Eurostat is the statistical office of the European Union, whose main task is to harmonize and publish reliable, comparable statistics for all EU member states. Founded in 1953 and today a Directorate-General of the European Commission, Eurostat has become one of the most authoritative European data sources, providing both policymakers and researchers with objective statistical information about processes across Europe.
Eurostat's unique value lies in its ability to standardize statistics from different countries using unified definitions, classifications, and methodologies. This means that Latvia's GDP indicator is calculated according to the same principles as Germany's or France's, allowing reliable comparisons. The office covers all areas of social life and regularly publishes both annual statistical collections and operational indicators.
Dataset examples:
- Economics: GDP, inflation, trade balance, government debt by EU countries.
- Demographics: population composition, births, deaths, migration, aging indices.
- Labor market: employment, unemployment.
- Environment: greenhouse gas emissions, renewable energy.
All Eurostat macro data are freely available without registration on the Eurostat website. The site offers both simple searches and complex queries with multiple filters and time dimensions.
The University of Latvia is included in Eurostat's list of recognized scientific institutions, giving UL scientists access to micro data - anonymized records of individual households, enterprises, or persons. To obtain access to micro data, an application with research description and justification must be completed (instructions).
For questions or clarifications, contact the UL Eurostat contact person - Science Department Senior Expert Aija Erta (email: aija.erta@lu.lv, phone: 67034938).
OECD
The OECD (Organisation for Economic Co-operation and Development) is an international organization that brings together thirty-eight economically developed countries with democratic governance and market economies. Founded in 1961, the OECD aims to promote economic growth, improve quality of life, and solve global problems through evidence-based policy and international cooperation.
OECD's unique value lies in its analytical capacity and ability to combine policy development with quality data collection. The organization not only compiles statistics but also conducts deep comparative analyses, develops international standards, and provides policy recommendations. This makes OECD data particularly valuable for researchers as they are not only statistically accurate but also contextualized with expert analyses and policy recommendations.
The organization is known for its regular reviews of member countries' economies, education system assessments, and innovative indices that have become international benchmark measurements. OECD data often serves as the basis for policy discussions at both national and international levels.
Dataset examples:
- PISA education data: student performance in mathematics, reading, sciences by countries and time periods.
- Tax policy: tax rates, tax revenues.
- Innovation: research and development investments.
- Social indicators: income inequality, poverty risks, social mobility.
The OECD data portal (https://www.oecd.org/en/data.html) offers free access to all macro data without registration requirements. The portal provides both simple searches and sophisticated analytical tools with the ability to create customized charts and tables. In addition to statistics, OECD also publishes methodological descriptions and analytical reports that help interpret data.
Practical tip: OECD data are particularly valuable for comparative studies as the organization regularly analyzes Latvia's performance relative to other member countries and provides specific improvement recommendations.
World Bank
The World Bank is an international organization founded in 1944 with the goal of helping countries develop, reduce poverty, and promote sustainable growth. Originally created for European reconstruction after World War II, the organization today operates as a global financial and consultative partner, particularly focusing on developing and middle-income countries.
The World Bank's unique position in global development financing enables it to compile and publish one of the broadest collections of international development data in the world. The organization not only provides loans and technical assistance but also systematically collects data on economic, social, and environmental development in all world countries. This data is particularly valuable as it covers both developed and developing countries, allowing analysis of global development patterns and comparison of different regional experiences.
World Bank data collection is based on cooperation with national statistical offices, international organizations, and research institutions. The organization not only compiles existing data but also develops new indicators and methodologies to better measure development processes. Particularly significant are the World Bank's poverty measurements and quality of life indicators, which have become international standards.
Dataset examples:
- Development indicators: GDP per capita, poverty level.
- Education: literacy rate, school attendance by levels.
- Health: child and maternal mortality, life expectancy, vaccination rates.
- Infrastructure: electricity access, safe water, sanitation services, internet penetration.
- Climate and environment: CO₂ emissions, forest cover.
The World Bank offers free access to all data without registration requirements. The portal provides both simple search tools and sophisticated analytical instruments with the ability to create customized charts, maps, and comparisons. Data are regularly updated and cover long series - in many cases back to 1960 - allowing analysis of long-term development trends.
Practical tip: World Bank data are particularly useful for studies on global trends, developing country experiences, and Latvia's position in international context, as the organization provides a comprehensive view of development processes.
European Data Portal
The European Data Portal is the European Union's main open data platform that combines and makes available public sector data. The portal's goal is not only to provide access to data but also to promote their reuse to create economic and social value.
The portal contains data from all EU member states as well as other European countries, covering a wide range of topics - from transport and the environment to health, education, and economy. Information is available in various languages, making it usable not only for experts but also for the public.
In addition to the data itself, the portal offers training materials and guidelines on how to use this data effectively. Since 2021, the portal has been known as data.europa.eu and is the main EU source for searching open data.
Data on the European Data Portal are available without registration and free of charge. They can be searched by keywords and categories.
Practical tip: The European Data Portal is particularly valuable for research on EU policy impact, regional differences in Europe, and international comparisons, as it covers both Latvian and other EU countries' data in one place.
Data search and acquisition can seem like a complex task, especially considering the enormous diversity of available sources and various access conditions. These practical recommendations are designed to help researchers and students systematically and efficiently navigate the data search process. Regardless of the type of research you need data for, these tips will help you find quality and reliable data sources.
How to Search for Data
A successful data search begins with a clearly defined research question. Before searching, precisely formulate what you want to learn, what type of data is needed (quantitative, qualitative, time series), what the scope of the study is (local, national, international), and which period interests you.
Develop a search term list in Latvian for national sources and in English for international repositories. Include synonyms and discipline-specific terms - for example, when searching for unemployment data, use both "unemployment" and "jobless rate."
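A search term list like this can also be generated systematically. The sketch below is only an illustration: the helper name `build_search_phrases` is hypothetical, and it simply pairs each topic term (including synonyms) with each geographic term so that no combination is forgotten.

```python
from itertools import product


def build_search_phrases(topic_terms, place_terms):
    """Combine every topic term with every place term into one search phrase."""
    return [f"{topic} {place}" for topic, place in product(topic_terms, place_terms)]


# Topic synonyms and geographic terms for an English-language search.
topic_terms = ["unemployment", "jobless rate"]
place_terms = ["Latvia", "Europe"]

phrases = build_search_phrases(topic_terms, place_terms)
for phrase in phrases:
    print(phrase)
```

The same function can be reused with a Latvian term list when searching national sources.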
Start with Google Dataset Search. Use short, general terms in English, even when searching for Latvian data. Review the search results and identify the most promising data sources for further verification. Then search in specific sources: academic data - Zenodo, Figshare, DataverseLV; Latvian statistics - CSB, Disease Prevention and Control Centre, Latvia Open Data Portal; international data - Eurostat, OECD, World Bank.
Do not forget about scientific publications - they contain references to valuable datasets, especially in methodology sections and supplementary materials. Choose sources depending on data type: search for current statistics in CSB, Disease Prevention and Control Centre, and Eurostat, but use World Bank and OECD data for historical time series.
Keep in mind that each scientific discipline has its main data sources. In social sciences, start with CSB and Eurostat social indicator data. In economics, use CSB, Eurostat, and OECD, and supplement them with World Bank data. In health, search Disease Prevention and Control Centre and Eurostat health statistics. For environmental studies, use Latvia Open Data Portal and Eurostat environmental data. For education research, focus on CSB, Eurostat, and OECD PISA data. For research data linked to publications, search general repositories such as Zenodo and Figshare.
How to Avoid Errors in Data Search
Avoid typical mistakes: do not search only in Latvian in international sources, do not rely on a single source, and do not download data without reading the documentation. In international sources, search in English and add "Latvia" or "Latvian," compare multiple sources, and familiarize yourself with the data collection process.
If you cannot find the necessary data, expand search terms with synonyms or related fields. Search for similar studies and analyze what data they used. Consider combining data from multiple sources or contact field experts.
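Comparing multiple sources can be done mechanically once the same indicator has been obtained from two places. The following is a minimal sketch under stated assumptions: the function name `compare_sources`, the 5% tolerance, and all values are hypothetical and chosen only for illustration.

```python
def compare_sources(source_a, source_b, tolerance=0.05):
    """Return (key, a, b) for shared keys whose values differ by more
    than the given relative tolerance."""
    mismatches = []
    # Only keys present in both sources can be compared.
    for key in sorted(source_a.keys() & source_b.keys()):
        a, b = source_a[key], source_b[key]
        baseline = max(abs(a), abs(b))
        if baseline > 0 and abs(a - b) / baseline > tolerance:
            mismatches.append((key, a, b))
    return mismatches


# Illustrative values only (year -> indicator value), not real statistics.
national = {2020: 8.1, 2021: 7.6, 2022: 6.9}
international = {2021: 7.5, 2022: 7.9, 2023: 6.5}

for year, a, b in compare_sources(national, international):
    print(f"{year}: national={a}, international={b}")
```

A flagged difference does not by itself mean that one source is wrong; it is a prompt to check whether the two sources use different definitions or methodologies.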
How to Assess Data Quality
- Assess data source reliability - government statistical offices (CSB, Eurostat) offer the highest quality, international organizations (OECD, World Bank) provide standardized data, and academic repositories contain data that typically accompany peer-reviewed publications.
- Check how data were collected, sample size, representativeness, and update frequency.
- Assess data completeness - whether there are many missing values, whether it covers the needed period, whether definitions are consistent.
- Read documentation - always familiarize yourself with data descriptions and methodology to understand data limitations and usage conditions.
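The completeness checks in the list above can be sketched in a few lines of Python. This is an assumption-laden example rather than a standard procedure: the column names `year` and `value`, the helper `completeness_report`, and the sample rows are all hypothetical.

```python
import csv
import io

# Hypothetical sample: one empty value (2019) and one missing year (2021).
SAMPLE = "year,value\n2018,10.5\n2019,\n2020,11.2\n2022,12.0\n"


def completeness_report(text, first_year, last_year):
    """Count empty values and list the years absent from the requested period."""
    rows = list(csv.DictReader(io.StringIO(text)))
    missing_values = sum(1 for row in rows if row["value"].strip() == "")
    years_present = {int(row["year"]) for row in rows}
    missing_years = [
        year for year in range(first_year, last_year + 1) if year not in years_present
    ]
    return {
        "rows": len(rows),
        "missing_values": missing_values,
        "missing_share": missing_values / len(rows) if rows else 0.0,
        "missing_years": missing_years,
    }


report = completeness_report(SAMPLE, 2018, 2022)
print(report)
```

Real datasets often mark missing values with special codes instead of empty cells, so the check should be adapted to what the dataset documentation specifies.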
How to Obtain and Document Data
Download freely available data immediately - sometimes organizations change access conditions, formats, or even remove data from websites. For restricted access data, an application with a research description must be submitted - expect that such application processing may take several weeks.
Always document each dataset - source name, link, download date, version, and license conditions.
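One simple way to keep this documentation consistent is to store it as a small structured record next to the downloaded file. The sketch below uses only the Python standard library; the class `DatasetRecord`, the helper `make_record`, and the example values (including the example.org link) are hypothetical placeholders.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date
from typing import Optional


@dataclass
class DatasetRecord:
    """The fields recommended above for documenting a dataset."""

    source_name: str
    link: str
    download_date: str  # ISO format, e.g. "2025-01-15"
    version: str
    license_terms: str


def make_record(source_name, link, version, license_terms,
                downloaded: Optional[date] = None):
    """Build a record; the download date defaults to today."""
    downloaded = downloaded or date.today()
    return DatasetRecord(
        source_name=source_name,
        link=link,
        download_date=downloaded.isoformat(),
        version=version,
        license_terms=license_terms,
    )


# Placeholder values for illustration only.
record = make_record(
    source_name="Example statistics portal",
    link="https://example.org/dataset/123",
    version="v1",
    license_terms="CC BY 4.0",
    downloaded=date(2025, 1, 15),
)

# Serialize to JSON so the record can be saved alongside the data file.
print(json.dumps(asdict(record), indent=2))
```

Saving such a record at download time makes it easier to cite the dataset later and to notice when a source has changed its version or license.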