Ethical Conditions - Personal Data Protection - Copyright and Intellectual Property Protection

Ethical conditions of work with social data

There are numerous codes of ethics and sets of standards that apply to empirical social research.

The main ethical requirements of data management, beyond the general requirements of the quality of scientific work, can be summarised as follows:

  • Respondents should be protected from the potential harmful effects of research even after the stage of field data collection has been concluded, in particular whenever the data are worked with, archived, made available, or made subject to secondary analysis. In general, information of an individual nature about survey participants and other personal data is confidential, and this confidentiality should be maintained. Special attention should be paid to sensitive information.
  • Respondents must be treated with respect and have the right to know the purpose and methods of utilisation of the information they provide and to decide about the ways it can be utilised. Consequently, their decisions must be respected.
  • Adequate utilisation of the information gathered, in line with the defined purpose, should always be ensured, not least so that the effort respondents made to participate in the research study is put to good use. Data gathered with public funding must be utilised as much as possible and, whenever the nature of the data allows, made available to the broader scientific community.

Personal data protection

The issues of personal data protection should be given adequate attention as early as the stage in which a research proposal is drafted. To underestimate them would not only constitute a violation of research ethics but might also restrict or completely prevent the researchers’ intentions from being fulfilled and in particular the data from being made available for secondary research. The following should be clear from the beginning:

  • Is it necessary to obtain respondents’ informed consent for personal data operations?
  • Will the data have to be anonymised?

A simple yes/no answer to these questions is insufficient; additional details are important. We need to exactly identify the phases of the research and the data life cycle in which the presence of personal information on respondents is unavoidable. Then we should plan our data management in such a way that it avoids any unnecessary operations with personal data and institutes adequate personal data protection measures where such operations cannot be avoided.

The following overview is based on Czech legislation:

On the one hand, legal regulation in European countries is to some extent similar because it is based on a common directive of the European Union. On the other hand, there are significant differences. For example, the Czech Republic does not apply specific rules to personal data processing for scientific purposes and its laws in this respect are among the strictest.

  • Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data
  • Consider that in 2012 the European Commission proposed a new EU General Data Protection Regulation that should supersede the Data Protection Directive. The rules of personal data protection in research are considerably more stringent in this proposal, but in some aspects also closer to current Czech law. (CESSDA: Individual Privacy Rights Strengthened – Research Possibilities Restricted)

Personal, sensitive and anonymous data
‘“personal data” shall mean any information relating to an identified or identifiable data subject. A data subject shall be considered identified or identifiable if it is possible to identify the data subject directly or indirectly in particular on the basis of a number, code or one or more factors specific to his/her physical, physiological, psychical, economic, cultural or social identity;’
‘“sensitive data” shall mean personal data revealing nationality, racial or ethnic origin, political attitudes, trade-union membership, religious and philosophical beliefs, conviction of a criminal act, health status and sexual life of the data subject and genetic data of the data subject; sensitive data shall also mean a biometric data permitting direct identification or authentication of the data subject;’
‘“anonymous data” shall mean such data that cannot be linked to an identified or identifiable data subject in their original form or following processing thereof;’
Czech Republic, Act No. 101/2000 Coll., Article 4

A general rule for any research study based on the collection of information from respondents is to obtain their informed consent. The data subject must at least be informed properly and in advance about the purpose of the data processing, the scope of the personal data, the name of the processor, and the time period the consent is given for. When it comes to so-called sensitive data, in practice consent must be obtained in writing and should preferably also be signed by the respondent, so that the existence of consent can be demonstrated as required. The respondent is also entitled to request further information about the data processing, and if the reasons for which consent was obtained cease to exist, the data processor must stop processing, i.e. must destroy the data.

Processing of sensitive data
The following rules apply to the processing of sensitive data in research. Sensitive data may be processed: ‘if the data subject has given his express consent to the processing. When giving his consent, the data subject must be provided with the information about what purpose of processing, what personal data, which controller and what period of time the consent is being given for. The controller must be able to prove the existence of the consent of data subject to personal data processing during the whole period of processing. The controller is obliged to instruct in advance the data subject of his rights pursuant to Articles 12 and 21…’ Czech Republic, Act No. 101/2000 Coll., Article 9

In order to process sensitive data, the data processor must also register their activity with the Office for Personal Data Protection. In other words, any institution planning to implement a research project that includes the processing of sensitive data must have a relevant reason for this kind of activity, have an adequate structure for securing the protection of such data, and must officially register in time, i.e. before the data processing begins.

Implementing this kind of data management entails, of course, additional organisational requirements and expenses. For this reason, it is necessary to consider carefully which research activities genuinely depend on processing personal data, and to obtain informed consent and implement data protection measures for those activities. However, even though social research tends to rely on the collection of individual data, it seeks to obtain aggregated information about society. Thus, personal data can often be omitted altogether or at least in some research stages. If this is the case, the data should be collected as anonymous or should be anonymised as soon as circumstances allow. Furthermore, there are also organisational reasons for doing so: informed consent is easier to obtain for a limited time period and a clearly defined purpose than for an extensive research exercise where respondents do not clearly understand the purpose or the consequences for them personally.

For example: While random sampling tends to identify specific addresses, the research study itself can make do without direct identification of households and respondents. Therefore, the dataset does not have to include such direct identifiers. If the database does not include so-called indirect identifiers either, then it is anonymous and no informed consent is required for analysing, archiving or sharing the data. Similarly, in a panel survey, we need to preserve the addresses in order to implement follow-up waves and survey the same units, but not for the analysis itself. Thus, addresses can be kept separately from the data collected. The database of addresses will be treated in line with the Personal Data Protection Act, while the dataset for analysis will remain anonymous.
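The separation described in the panel survey example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`name`, `address`), the function `split_identifiers` and the `link_key` column are hypothetical and do not come from any particular survey or from the legislation. The random key exists only so that follow-up waves can be matched to the same units; it would have to be dropped from any file intended for sharing, and whether a given file counts as anonymous remains a legal question that code cannot settle.

```python
import secrets

# Hypothetical names of columns holding direct identifiers.
DIRECT_IDENTIFIERS = ("name", "address")


def split_identifiers(records, direct_fields=DIRECT_IDENTIFIERS):
    """Split raw survey records into a contact table and an analysis table.

    Each respondent gets a random link key that carries no information
    about the person. The contact table (key plus direct identifiers) is
    meant to be stored separately and protected as personal data; the
    analysis table holds the key and the substantive answers only.
    """
    contacts, analysis = [], []
    for record in records:
        link_key = secrets.token_hex(8)  # random key, not derived from the data
        contacts.append(
            {"link_key": link_key,
             **{f: record[f] for f in direct_fields if f in record}}
        )
        analysis.append(
            {"link_key": link_key,
             **{k: v for k, v in record.items() if k not in direct_fields}}
        )
    return contacts, analysis
```

Removing direct identifiers in this way does not deal with indirect identifiers, which still have to be checked in the analysis table.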

Direct and indirect identifiers
Direct identifiers include names, national identification numbers, addresses etc. Indirect identifiers make a person’s identification possible when associated with other known data. Such identification may also be possible by combining data from several variables in the data file. For example, if the dataset contains information about a person’s job as mayor in a given city, then a specific person holding that office can be identified even though the database does not contain that person’s name.
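A rough way to look for risky combinations of indirect identifiers is to count how many records share each combination of values. The sketch below is a minimal illustration, not a legal test of anonymity: the function name `risky_records`, the variable names in the usage example and the threshold `k` are assumptions chosen for this example, and uniqueness within the file is only a proxy for how easily a person could be identified with outside knowledge.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=2):
    """Return records whose combination of values on the given
    indirect identifiers is shared by fewer than k records."""
    def combo(record):
        return tuple(record.get(q) for q in quasi_identifiers)

    counts = Counter(combo(r) for r in records)
    return [r for r in records if counts[combo(r)] < k]


# Usage with made-up data: a single mayor in a municipality stands out.
sample = [
    {"occupation": "mayor", "municipality": "A"},
    {"occupation": "teacher", "municipality": "A"},
    {"occupation": "teacher", "municipality": "A"},
]
flagged = risky_records(sample, ["occupation", "municipality"])
```

Records flagged in this way are candidates for coarser coding (for example broader occupational or regional categories) before the file is archived or shared.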

Another situation frequently arises: at a certain stage, often just for the purposes of data collection or building connections between databases, personal data has to be used, but in all the other stages of the research the personal data of respondents can be omitted – and discarded. The process of discarding identifiers is referred to as anonymisation (see Manage Data during the Research Process).

If the database contains direct or indirect identifiers and cannot be anonymised, we must obtain informed consent from the respondents and count on spending money on personal data protection. If the database is not anonymous, given the topics typical for social research, which often involve sensitive data, in the Czech Republic consent should be given in writing and data protection measures must be more thorough. At the same time, this poses an important barrier to data sharing. The purpose of data processing must be formulated in specific and time-limited terms in the informed consent request. In the Czech Republic, it is impossible to obtain consent to the processing of data for an unlimited time, for an unknown purpose, or for the purpose of sharing it with anybody. As a result, non-anonymous databases are usually not made available in data archives.

Copyright and Intellectual Property Protection (IPR)

Copyright and intellectual property protection is complicated and a thorough treatment of these issues requires professional legal advice. In each research institution such legal advice should result in the creation of ground rules and standard practices for its employees to follow. Nevertheless, each researcher should be aware of at least the basic contexts.


In the Czech Republic, intellectual property rights are governed in particular by the Copyright Act, i.e. Act No. 121/2000 Coll.

Copyright covers any works which are the unique outcome of the creative activity of the author and are expressed in any objectively perceivable manner including electronic form, permanent or temporary, irrespective of their scope, purpose or significance.

‘The subject matter of copyright shall be a literary work or any other work of art or a scientific work, which is a unique outcome of the creative activity of the author and is expressed in any objectively perceivable manner including electronic form, permanent or temporary, irrespective of its scope, purpose or significance...’ Czech Republic, Act No. 121/2000 Coll., Article 2 (1)

‘...A database which by the way of the selection or arrangement of its content is the author’s own intellectual creation, and in which the individual parts are arranged in a systematic or methodical way and are individually accessible by electronic or other means, is a collection of works...’ Czech Republic, Act No. 121/2000 Coll., Article 2 (2)

   >>> A database which, by way of the selection or arrangement of its content, is the author’s own intellectual creation, and in which the individual parts are arranged in a systematic or methodical way and are accessible by electronic or other means, is a collection of works, and as such it is covered by the Copyright Act. Copyright arises when the database is being created. The fact that a database does not bear a ‘copyright’ label does not exclude it from this legal framework.

   >>> Copyright protection covers the authors’ work, not the individual facts stated in it. As far as databases are concerned, this means that copyright covers the selection and arrangement of data in a database etc., while its content may not be covered, depending on what exactly the content is. For example, for an in-depth interview, copyright to the recording is held by the researcher, while the rights to the individual statements remain with the informant.

   >>> Copyright protects intellectual property from unauthorised distribution, given the potential loss of income and moral damage. The rightholder chooses the ways of disposal of his/her work and decides about its distribution. Nevertheless, copyright is not infringed by anybody who in his or her own work to a justified extent uses excerpts from the work of other authors, or small works in their entirety, for the purposes of the critique or review of such a work or for the purposes of scientific or technical work, or uses the work while teaching for illustrative purposes or in non-commercial scientific research. However, in doing so, it is always necessary to cite the name of the author, the title of the work, and the source.

   >>> Through a written license agreement, the author can grant authorisation to use the work, either in specific ways or in all ways of use, and either to a limited or to an unlimited extent. A license can be either exclusive or non-exclusive. In the case of an exclusive license, the author must refrain from further distribution of and from exercising the rights to use the work to which he granted the license.

   >>> The copyright belongs to all the authors of the work, for example, the entire research team, and not only the team leader or the project’s principal investigator. The same applies to university research: the rights do not belong to the teacher only but also to the students who participated in organising the research study. However, a person who has contributed to the creation of the work merely by providing assistance or advice of a technical, administrative or expert nature or by providing documentation or technical material, or who merely gave the impulse to create the work is not considered to be a joint author.

   >>> Databases are often created in the framework of an employment relationship. As a rule, the employer exercises the author’s economic rights to a work in his or her own name. Economic rights cover the different ways of using the work, e.g. reproduction, distribution, exhibition, lending, or making the work available. The author’s moral rights, e.g. the right to claim authorship, the right to the inviolability of a work (alterations), or the right of supervision over compliance with obligations, remain unaffected.

   >>> Thus, authorisations for the secondary use of or access to a database in an archive are often granted by the employer, rather than the authors’ team. In this respect, it is worth mentioning that most students are not employees of their university, which means that economic rights to their works are not transferred to the university in their entirety. In some cases, too, academic institutions transfer economic rights to their employees, especially for the purposes of publication activity; sometimes the scope of these institutional rules includes other outcomes and activities as well, which may affect the regulation of rights to databases.
(The student-author-university relationship is more complex. Schools or school-related or educational establishments have the right to conclude, under the usual terms, a license agreement on the utilisation of a school work. Unless otherwise agreed, the author of a school work may use his work or may grant the license to any other party, unless this contravenes the legitimate interests of the school.)

   >>> Databases can also be created and shared in an environment of wide-open collaboration based on free licenses such as Creative Commons. Then, users can not only utilise the database but can also contribute to it, expand it, update it, or make other alterations, subject to license conditions.

Prepared by: Jindrich Krejci, 2013 - 2014