26 responses to “Is this Open Data?”

  1. Natasha Malik

    https://opentrials.net/

    I found a data repository called “OpenTrials” which aims to “locate, match, and share all publicly accessible data and documents, on all trials conducted, on all medicines and other treatments, globally”. I think this is Open Data because, according to their paper on the project, their “intention is to create an open, freely re-usable index of all such information and to increase discoverability, facilitate research, identify inconsistent data, enable audits on the availability and completeness of this information, support advocacy for better data and drive up standards around open data in evidence-based medicine.”

  2. Jennifer Ma

    https://www2.gov.bc.ca/gov/content/data/statistics/business-industry-trade/small-business
    This is a link to a data table of small business counts and employment from BC Statistics Canada. In my view, it is Open Data because it meets the FAIR principles. The data is accessible to the public, can be redistributed under the provided terms, and does not appear to restrict public use.

  3. Ciara Zogheib

    https://core.tdar.org/dataset/372427/whitehall-catalog This is a link to the option to download archaeological data from the excavations of the Whitehall rock shelter. This data is open because it is available and accessible in a convenient form (Excel spreadsheet), it is open for reuse and redistribution, and universal participation is allowed (a tDAR account is needed, but anyone can make one).

  4. Alyssandra Maglanque

    While looking through the data repositories listed in the OAD, I saw that Microsoft Research Open Data was linked. I was curious to see how a megacorporation like Microsoft followed Open Data and FAIR principles, and found that while the data that Microsoft researchers collect does seem to be open for people to use, there are also barriers: downloading an entire dataset requires a program (Azure and its various tools) that needs a paid subscription for long-term use. Not only that, but there is a limited amount of time to download the dataset once Microsoft issues you a time-limited key. This greatly reduces access to the datasets. Taking a look at the Microsoft Research Open Data License Terms also reveals that it fails to meet the “Universal Participation” requirement for Open Data, as Microsoft only allows their datasets to be used for non-commercial purposes. In conclusion, I don’t believe that the Microsoft Research Open Data directory is truly Open Data.

  5. Ian Harmon

    I examined a dataset in Open ICPSR called “National Neighborhood Data Archive (NaNDA): Arts, Entertainment, and Recreation Organizations by Census Tract, United States, 2003-2017,” available at: https://doi.org/10.3886/E115543V2.

    The dataset consists of four files: a PDF, a .sas7bdat file (which I think is opened with SAS), a .dta file (Stata), and a .zip file that appears to contain a csv file and a readme. I tried to download the zip file to look at it more closely, but I was taken to a login screen for “regular” (i.e. not “open”) ICPSR. I found this puzzling, and didn’t want to create a new account to see if I could then access this data. The record for the dataset in Open ICPSR does have a CC-BY license.

    Setting aside the issues of obtaining the files, I would say the dataset is generally open. The CC-BY license allows for reuse and redistribution, and there is no barrier to universal participation. While some of the files are in proprietary or unstructured formats, the inclusion of a csv file does allow for easier use of the data.

  6. Michelle

    Here are some interactive government datasets that focus on spatial data. I would say the data these are based on are semi-open. The data is available (including the original shapefiles, etc., used to make the maps), but it is fairly difficult to find through a series of different government websites (and not necessarily linked to from the interactive maps, which are much easier to find). The second map is a bit better, in that the links to the raw data are a little more accessible, but this kind of spatial data often requires specific software to be able to modify and use.
    People are allowed to re-use the data.

    http://geobosques.minam.gob.pe/geobosque/visor/#
    http://geo.serfor.gob.pe/geoserfor/

  7. David Gill

    I looked at the Marine Geoscience data system https://www.marine-geo.org/index.php. The data found on this website is not open because the data and metadata are licensed under CC BY-NC-SA 3.0 US. This licence means the data can only be used for noncommercial purposes. https://www.marine-geo.org/about/terms_of_use.php

  8. Neah Ingram-Monteiro

    I found a dataset through Databrary, which is a repository for video data. I used the option to find a dataset that “Contains data released for public use.” I looked at a few datasets and most were limited to a few highlights rather than a complete dataset, but this could be because the release level is set to “public.” The Databrary repository does make data available in .csv format. Looking at this repository brought up the question for me of whether .mp4 format is considered interoperable; according to the UK Data Service cited at https://www.openaire.eu/data-formats-for-preservation, it is one of the preferred formats for preserving video files.

    On one dataset that I focused on (https://nyu.databrary.org/volume/34), there was no description of the remaining data, and the “description of dataset” was a few sentences that described the study rather than the data.

    I would not consider this particular dataset to be open, as it is not available as a whole, doesn’t follow FAIR principles, and is not clearly provided under an open license.

  9. Bart McLeroy

    I looked for data that is mandated to be disclosed by mortgage lenders in the US under the Home Mortgage Disclosure Act. This is held by the Consumer Financial Protection Bureau and is located here: https://www.consumerfinance.gov/data-research/hmda/historic-data/

    No login or other credentials were required to obtain the information. The data is downloadable as a CSV file, so I would view it as being Open Data.

  10. Erin Calhoun

    I selected a data set from the City of Toronto’s Open Data Portal because I was interested in how “open” data from a public institution was. I selected the “Outbreaks in Toronto Healthcare” dataset. The data comes with a “readme” file and is available for download in an xls format. No login credentials are required to view the data. There are also code snippets (although I am not 100% sure what they are used for) in Python, R, and Node.js for accessing the data through an API.
    The data is available as a whole (broken down by year) and permitted for re-use and redistribution. Based on these observations, I believe that the data is open in this case.

  11. Amanda Yang

    I selected a dataset from Microsoft Research Open Data called “Power Grading Short Answer Corpus.” The link is here: https://msropendata.com/datasets/c2400e0d-f71e-4b10-9fbd-696f8ffcb0ee

    The dataset is licensed under the Open Use of Data Agreement v1.0, which adheres to the 3 principles of open data. The license states you may use, modify, and distribute the data with no restrictions on use, distribution or results, provided you give appropriate credit or attribution. The files are also in .tsv, .rtf, and .txt formats, which are machine-readable.

    However, I am unable to download the files and can only view them, which is not open data. It says that I must log in to export this data set, which is also not a sign of open data. Overall, while the license suggests open data privileges, the restricted access renders this dataset not open data.

  12. Crystal Wu

    I selected the data set “Modal verbs of strong obligation in Scottish Standard English: Corpus data” [https://dataverse.no/dataset.xhtml?persistentId=doi:10.18710/GIP0PM] from the TROLLING data repository, which is said to be an open access archive. This specific dataset is available for download without log-in restrictions. That said, this dataset is protected under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license, which states that the material cannot be used for commercial purposes. That restriction is not compliant with the definition of open data.

  13. Hannah Tanna

    I selected the dataset for “Status of COVID-19 cases in Ontario by Public Health Unit (PHU)” from https://data.ontario.ca/en/dataset/status-of-covid-19-cases-in-ontario-by-public-health-unit-phu
    I believe it is open data as it follows the FAIR principles and states in its metadata that its license is Open Government Licence – Ontario. The restriction is that you must acknowledge the source with an attribution statement when doing any of the following: Copy, modify, publish, translate, adapt, distribute or otherwise use the Information in any medium, mode or format for any lawful purpose.

  14. Susan Cox

    I located a website that I have used for teaching in health professional education. It is called Healthtalk.org, and it is a repository of thousands of qualitative interviews with persons in the UK who have experienced a long list of health conditions. There are transcripts and audio files for most of them, so it is very useful. Researchers can become members and contribute data as long as it conforms to ethical and methodological requirements. Public access to the data is available through their website. But the data is not truly open. The policy states “You are not permitted to change, store, copy, publish, sell or distribute any such material without our prior consent in writing. The personal use of the website is free and covered by these Terms and Conditions of Use but we ask that organisational/corporate/institutional users pay a small annual support fee to help the dipex charity with the running costs of the website.”

  15. Ksenia Cheinman

    I found a data set of English-French Bilingualism https://open.canada.ca/data/en/dataset/a0bff264-1c80-41ee-aef9-e7da347c5158 by different levels of geographies from Open Government, Canada.ca. This data is open, as it is released under the Open Government License, which allows users to “copy, modify, publish, translate, adapt, distribute or otherwise use the information in any medium, mode or format for any lawful purpose.” It is also available in a csv format, which makes it more interoperable.

  16. Lindsay Cuff

    I looked at the Catalogue of Life dataset: https://www.catalogueoflife.org/
    I found it very difficult to discover what the terms of the license actually were!
    At the footer of the page, there was this license: © 2020, Species 2000. This online database is copyrighted by Species 2000 on behalf of the Catalogue of Life partners. Unless otherwise indicated, all other content offered under Creative Commons Attribution 4.0 International License, Catalogue of Life, 2021-05-07.
    There is a whole page about how to cite each checklist properly, and the COL is currently developing a DOI system to track usage of each dataset.

    Even though this dataset is freely accessible by the public, I don’t believe this data is open according to the definition we learned in this module, since it appears that one cannot copy, modify, publish, translate, adapt, distribute or otherwise use the information in any medium, mode or format for any lawful purpose.

  17. Claire Swanson

    https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70225094

    I looked at a dataset from the Cancer Imaging Archive. It does look like at least some of the data is machine readable as the clinical data is available as a CSV (I am not familiar with DICOM image files though). This dataset is not open, however, because it is listed as limited access. Clicking on the download option for either the image data or clinical data will direct you to make a request to use the data.

  18. Cari

    I took this opportunity to check out a local data set – the City of Calgary Open Data Portal. The data set I downloaded was Calgary Emergency Shelters Daily Occupancy (https://data.calgary.ca/Services-and-Amenities/Calgary-Emergency-Shelters-Daily-Occupancy/7u2t-3wxf). It appears to meet the definition of open data. It is available for download for free by any member of the public in non-proprietary formats like CSV. The license allows for reuse and adaptation for any lawful purposes as long as the source is cited (https://data.calgary.ca/stories/s/Open-Calgary-Terms-of-Use/u45n-7awa). One thing that would improve the future usability of the data set would be to provide more context for how the data being reported was collected – I believe that the individual shelters must have to report to the City’s Department of Community and Social Services as a funding requirement, but that isn’t clearly articulated on the site.

  19. Victoria B.

    https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7724#!/details

    The dataset I found is from the Economic and Social Research Council. It’s called “European Quality of Life Time Series, 2007 and 2011: Open Access.” I think this is open data because anyone can access the data by downloading it from the internet, in a variety of file formats (e.g., CSV, SPSS, Stata, TAB). The data appears to allow for reuse and redistribution and provides citation information that allows users/researchers to point to where they accessed the data. I do not notice any restrictions on the use of the data. There is also a note that “The Data Collection is to be made available to any user without the requirement for registration for download/access under a Creative Commons Attribution 4.0 International Licence.” So, I think this is open data.

  20. David G.

    I found an interesting set of data and resources titled “Safe food handling for immunocompromised individuals.” Its licence is “Open Government Licence – Canada.” When I click this link (https://open.canada.ca/en/open-government-licence-canada), it says I am free to “copy, modify, publish, translate, adapt, distribute or otherwise use the Information in any medium, mode or format for any lawful purpose” on condition of “acknowledging the source of the Information by including any attribution statement specified by the Information Provider(s) and, where possible, provide a link to this licence.” Based on this, I can use this open data for my research studies.

  21. Ceciia Canal

    I found this data in the Yukon Government “Yukon Land Planning Information” – part of the Open Data from the Yukon.
    https://open.yukon.ca/data/datasets/yukon-land-planning-information

    This is the license https://open.yukon.ca/open-government-licence-yukon

    This is an “Open Government License”, as it states that “you are free to Copy, modify, publish, translate, adapt, distribute or otherwise use the Information in any medium, mode or format for any lawful purpose.” You must, however, “Acknowledge the source of the Information…The terms of this licence are important, and if you fail to comply with any of them, the rights granted to you under this licence, or any similar licence granted by the Information Provider, will end automatically.”

    So I can definitely say that this is Open Data. It also has other formats available, like JSON & RDF.

  22. Michelle Johnson

    With my institution located in Oshawa, I decided to look into the open data provided by the city. I did find some inconsistencies in some of the license information but overall, I think they uphold their promise of “open data”. https://city-oshawa.opendata.arcgis.com/datasets/DurhamRegion::health-neighbourhoods-demographics-indicators/about

    The data is easily downloadable in several different file formats, and the shapefile, I learned, is relatively interoperable with GIS software beyond simply ArcGIS. My only concern is that the downloadable data does not come with its descriptive metadata. The information on how to discern the labeling in the .CSV file lives on the Oshawa Open Data website. While I don’t pretend to be an expert on open data, from what I’ve learned in this course, I wish there was a README file included with this data so that, even offline, it would still follow FAIR principles.

  23. Amber Gallant

    I decided to examine RepoData, a “project to identify, gather, standardize, and make publicly accessible United States archival repository location data”. While the project has wrapped, during its run it was able to create a public map of over 18,000 repositories and archives of all types in the United States that are vulnerable to climate change (https://www.arcgis.com/home/webmap/viewer.html?webmap=6cc5e9301e28453cba9737f7e8d284df&extent=-125.6236,25.3089,-68.8902,52.8456). It also includes a GitHub repository of the publicly-available data used (https://github.com/RepoData/RepoData) and documentation about the project (https://osf.io/cft8r/). While determining whether the data was truly open data, I found that the GitHub repository contained a README file stating that the data is licensed under the ODbL 1.0 license (Open Data Commons Open Database License: https://opendatacommons.org/licenses/odbl/summary/). Any reuse of the data must acknowledge both RepoData and the original source of the data points. The data itself is easily downloadable as .CSV files; it is interoperable, and the license follows the rule of universal participation. Therefore, I can say this is open data.

  24. Majid Alimohammadi

    I explored the “http://www.emdataresource.org/” URL.

    I searched for a specific image “https://www.emdataresource.org/EMD-25707” and I found the data to be open. I had full access to all the information required, including experiment information, mapping, and data validation. This makes the experiment reproducible.

  25. Lilian H

    I chose Aozora Bunko: https://www.aozora.gr.jp/

    Availability and Access: The data is available at no cost and available as zip/ebk/html files.
    Reuse and Redistribution: The data is provided under terms that permit reuse and redistribution.
    Universal Participation: Everyone can use, reuse, and redistribute work from Aozora Bunko.

  26. Haruki S

    For this activity, I chose Internet Archive: https://archive.org. It is not only literature that you can access on this website, but also classical movies from all over the world. It is free and available to anyone. It is such a useful resource for cinema studies courses.
