The Analogy of Science and Language: Representation issue and game perspective

Introduction

The thesis of this essay is that even if science starts as a universal knowledge system following a logical structure that applies to all humans, in the end its outcomes are defined by the game we play with language.

Part one of the essay draws the analogy between language and science as cognitive processes by focusing on the logical structure the two share. The aim of this part is to show that, as is the case for language, the issue of science starts not in the logical process but at the moment the scientific activity is shared. Part two describes this issue explicitly, pointing out that the problem is created by representation and the processes connected to this phase of science. A case study is given in support. Part three makes the issue more practical by approaching it from a semantic perspective. The aim is to show that scientific knowledge is a matter of semantic understanding. For this, Wittgenstein’s concept of the language game is used.

  1. Science and language as cognitive systems

Currently, more than 6000 languages are used in the world (Klappenbach, 2022). All these languages can be understood in the same way. According to Chomsky (2007), languages are cognitive systems with the function of communication. A cognitive system is a system that is able to interact with the environment and to analyze structured and unstructured data by transforming information into knowledge.

Languages must be able to render explicit the implicit knowledge of the speaker. Linguistic studies divide the structure of language into two main components: the surface structure and the deep structure (Pelletier & Pullum, 2022). The first refers to the structure of grammar, the organization of sentences, the syntactic and writing rules, the phonological components, and the logical form. The second involves the semantic components used to interpret the meaning of words and sentences. Both are part of the communication process through which language, from an internal activity of thought, becomes shared in the world. While the problem of language is to make the message explicit, the problem of science is to make knowledge explicit. As with science, when language is shared, it can be misinterpreted. In the following part of the essay, the elements of the surface structure are compared with the elements of science. Chomsky’s view of language as a rational process, and its analogy with science, is analyzed in a separate sub-section.

1.1 Systems structure 

Science, like language, has different components that can be distinguished from one another. Figure 1 gives a scheme of these components, their structure inside the cognitive system, and their relations.

Figure 1. Science (Karaca, 2021 [PowerPoint slides]). Language (Chomsky, 2007, p. 165).

Both cognitive systems start with an observation of the phenomenon to be analyzed as a whole and related to the bigger picture. It could be, for instance, someone screaming or an apple falling from a tree. In both cases, before the subject tries to relate the phenomenon to the bigger picture, to its meaning and its understanding, the subject only observes and interacts with it through the five senses. The second step in the process is the interpretation of the phenomenon, which in language is the semantic understanding and in science is the hypothesis and auxiliary assumptions. The third phase is the analysis of each element and of the relations between the elements. In science, this leads to the prediction of the phenomenon, which then enables the empirical test. From this description, the following relations are obtained:

L1 = S1,S2 

In language the base components are the elements; similarly, the observation and the question are the basic elements from which science begins.

L2 = S3,S4

Semantic understanding takes place.

L3 = S5,S6

In order to make predictions, the relations between the single elements, and the role of each, have to be analyzed.

It is important to add that the relations L1→L2→L3 are possible because of the “memory” that transmits information from each step to the next. Without this “memory” the process could not be built, because at every step one has to “remember” how each phase relates to each element. This is a further argument for viewing science as a cognitive system. The “memory” is applied in the logical structure. While several structures make up a language, in science the structure is modus ponens, which has the form:

P → Q (If P, then Q)

P

=====

Q

This applies when forming a single sentence and to the way one builds it, especially when relating statements to each other.
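The validity of modus ponens can be checked mechanically with a truth table. The following is only a minimal sketch, not part of the cited material: the helper names `implies` and `is_valid` are assumptions introduced for illustration. It confirms that inferring Q from "P → Q" and P never leads from true premises to a false conclusion, whereas the reverse pattern (inferring P from "P → Q" and Q) does not have this property.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def is_valid(premises, conclusion):
    """Return True if the conclusion holds in every truth assignment
    to (P, Q) that makes all premises true."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False
    return True


# Modus ponens: from "P -> Q" and "P", infer "Q".
modus_ponens_valid = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: p],
    conclusion=lambda p, q: q,
)

# Contrast: from "P -> Q" and "Q", inferring "P" is not a valid form.
converse_valid = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: q],
    conclusion=lambda p, q: p,
)

print(modus_ponens_valid)  # True
print(converse_valid)      # False
```

The contrast matters for the argument of this section: the logical form guarantees the step from premises to conclusion, but it says nothing about whether the premises themselves are well supported by empirical data.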

1.2 Science as language

According to Chomsky (2007), language is not an empirical but a rational process. This means that humans have some innate properties for creating language, in the sense that certain cognitive systems enable its realization. The empirical view, on the other hand, holds that at birth a person is like a blank sheet of paper on which anything can be inscribed and which has no pre-given properties.

In this perspective, science could be defined as a language which has a rational basis (memory and logical form), with the “grammar” being the empirical world. This means that the object of analysis and reasoning is empirical data.

It has been argued that science follows a deductive logical structure with the form of modus ponens. However, the central role that empirical data plays as the object of scientific inquiry creates the issue of inductive risk: the risk of accepting or rejecting a hypothesis with less empirical evidence than certainty would require (Karaca, 2021). This issue raises questions about the truth of a theory and about which elements influence the development and representation of theories and knowledge.

This essay argues that science has similarities with language, including the issues that affect both. One of them is the problem of transmitting knowledge from one individual to others. The fact that information mediated by the subjectivity of one individual is passed on to other individuals raises issues of semantics and representation. This is explored further in the second part of the essay.

  2. The issue of misrepresentation

Scientific knowledge acquires meaning from the moment it is shared in society. This communication can create issues. Representing means putting abstract knowledge into the form of a theory or ideas and giving it a visual form. During this process, two things can happen. Firstly, information can be lost or changed in the translation between forms. Secondly, the reader might misinterpret the knowledge that is written, or their understanding might be distorted by their own bias. From this second perspective, the culture in which the information is shared plays an important role in mediation.

In his work, Giere (2004) underlines the issue of representation in science. Making an analogy with language, the author states that science, like language, carries inside it an object which must be represented for communication. This is a twofold relationship between the linguistic entity and the world. Giere suggests that an analysis of the practice of science should take the process of representation as its first point of inquiry. To understand it better, the following relation is given:

S uses X to represent W for purpose P

(Giere, 2004, p. 743)

where S is the subject, W is an aspect of the world, P is the purpose of the subject, and X is the element that the subject uses to represent W. The issue at hand lies in the transfer of abstract or theoretical information between S and X and in how X is related to P. The element to be investigated is X and its relationship first with S and then with P, questioning the semantic relations, the non-epistemic values embedded in it, and the inductive risk that arises from the relationship. Furthermore, this element can increase cultural biases and inductive risk (as argued in Douglas, 2000) and give a misrepresentation of the world. In support of this, a case study regarding Google Translate is described below.

2.1 Google Translate and gender bias

In their study, Prates et al. (2019) examine the gender bias present in Google Translate. They show that Google Translate has a strong tendency towards male defaults, especially for fields that are typically linked to unequal gender distribution or stereotypes, such as STEM jobs. The researchers analyze the problem by taking a list of job positions from the US Bureau of Labor Statistics (BLS) and using them to form simple sentences (e.g. ‘He/She is an engineer’) in twelve gender-neutral languages, such as Chinese and Hungarian. These sentences are translated into English using Google Translate, which makes it possible to collect statistics on the frequency of male, female and gender-neutral pronouns in the translation output. These results are compared with the BLS data on women’s participation in each of the occupations. The study found, for instance, that the gender-neutral Turkish sentence ‘o bir doktor’ is translated into English as ‘he is a doctor’.
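The counting step of such a study can be sketched in a few lines. This is only an illustration of the idea, not the authors' code: the sentences in `SAMPLE_TRANSLATIONS` are hypothetical placeholders rather than data from Prates et al. (2019), and the function names and the pronoun table are assumptions introduced here.

```python
from collections import Counter

# Hypothetical English outputs of a translation system.
# These are illustrative placeholders, NOT data from Prates et al. (2019).
SAMPLE_TRANSLATIONS = [
    "He is a doctor",
    "He is an engineer",
    "She is a nurse",
    "They are a teacher",
]

# Assumed mapping from the leading pronoun to a gender label.
PRONOUN_GENDER = {"he": "male", "she": "female", "they": "neutral", "it": "neutral"}


def classify_pronoun(sentence):
    """Label a sentence by its first word; anything unrecognized is 'other'."""
    words = sentence.strip().split()
    if not words:
        return "other"
    return PRONOUN_GENDER.get(words[0].lower(), "other")


def pronoun_shares(sentences):
    """Return the share of each gender label among the given sentences."""
    counts = Counter(classify_pronoun(s) for s in sentences)
    total = len(sentences)
    return {label: count / total for label, count in counts.items()}


shares = pronoun_shares(SAMPLE_TRANSLATIONS)
print(shares)
```

In a study of this kind, shares computed per occupation would then be set against the recorded participation of women in that occupation, which is the comparison with the BLS data described above.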

The authors argue that a statistical translation system can be expected to reflect, at most, the imbalances and inequalities in society, since an automated translation tool is based on examples produced by society and will therefore contain some of its biases. Furthermore, the language a person speaks affects their knowledge and cognition of the world, which means that languages which grammatically distinguish between genders may reinforce a bias in that person’s perception of the world. However, the results of the study show that male defaults are not just prominent but exaggerated in fields touched by gender stereotypes. In other words, Google Translate fails to reflect the real distribution of female workers and amplifies the stereotypes even more (Prates et al., 2019).

Google Translate works by using English as the lingua franca for translations between other languages (e.g. Italian → English → Polish). This can cause errors in translations, making them inaccurate or, as in this case, gender biased. The issue can be compared to translating scientific information into scientific representations.
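A toy sketch can make the effect of a pivot language concrete. It assumes two tiny hand-written lookup tables, which are hypothetical and stand in for a real translation system; they say nothing about how Google Translate is implemented. The point is only that once the pivot step has chosen a gendered pronoun, the second step has no way to recover the neutrality of the source.

```python
# Hypothetical hand-written lookup tables, for illustration only.
# They do not reflect the internals of any real translation system.
TURKISH_TO_ENGLISH = {
    "o bir doktor": "he is a doctor",  # the neutral "o" is forced to a gendered pronoun
}
ENGLISH_TO_POLISH = {
    "he is a doctor": "on jest lekarzem",
    "she is a doctor": "ona jest lekarką",
}


def translate_via_pivot(sentence, source_to_pivot, pivot_to_target):
    """Translate in two steps and return (pivot sentence, target sentence)."""
    pivot_sentence = source_to_pivot[sentence]
    return pivot_sentence, pivot_to_target[pivot_sentence]


pivot, target = translate_via_pivot(
    "o bir doktor", TURKISH_TO_ENGLISH, ENGLISH_TO_POLISH
)
print(pivot)   # he is a doctor
print(target)  # on jest lekarzem
```

The second table contains a female form, but it is never reached: the choice made in the intermediate representation determines what the final reader sees.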

  3. The Science Game

So far it has been argued that even if science, like language, follows a logical structure which can be applied to all human beings, issues arise when the knowledge or message created through the logical process is represented and shared. The reason, it was argued, lies in the transformation of information during the process of translation and in the semantic interpretation embedded in it.

The semantic issues of language in science are not new. Here it is useful to return to the language representation issue and focus on two relevant points: the issue of semantic incommensurability between natural languages, and the issue of metalanguages and logico-mathematical syntax. Semantic incommensurability refers to the different meanings of a representation in the same domain. Firstly, not every language can describe a given scientific theory, and natural languages are not objective languages. Secondly, different disciplines use different metalanguages; for instance, mathematics has different meanings in physics and in engineering. Even within physics, mathematics has different meanings, e.g. in classical physics and in quantum physics. It can therefore be stated that science also uses language in the form of a metalanguage, which is mathematics. Metalanguages work better than natural languages in obtaining objectivity. However, as pointed out by Carnap (as cited in Irzik & Grünberg, 1995), metalanguages also bring a problem of interpretation.

This essay investigates the issue from a different perspective. Following Wittgenstein’s framework, it is argued that scientific knowledge, like language, takes part in several games when it is shared in society, and that these games create the practical outcomes of science in society. By games it is meant that a certain entity as a whole (i.e. language or science) has a different meaning and value depending on its context and use. According to Wittgenstein (as cited in Coeckelbergh, 2017), language lives in its use (i.e. in a form of life). Language has to be understood in its use, and it is therefore interwoven with people’s activities and games, each with their own rules.

Because language is related to the way we live, imagining a language in fact means imagining a form of life (i.e. something that is in use; if it is not, it is not life). Like Chomsky, as described in the first part, Wittgenstein divides language into two structures, deep and surface. In this context, the deep structure is under analysis, and it is here that meaning, games and forms of life take place. Furthermore, in Wittgenstein’s view language has two dimensions, normative and social. The normative dimension tells one what to do and how to use language. The social dimension refers to the fact that knowledge of activities and games is shared with others, and that language is historical and changes with time and individuals. All this, according to Wittgenstein, is part of what humans are. Translated in support of the essay’s thesis, science as a cognitive system involved in the practice of knowledge is part of a form of life, part of what one knows and how one knows it; it is part of what we are.

As suggested by Coeckelbergh (2017), society is linked to the use and interpretation of science through the patterns of games, forms of life and grammar (i.e. structure). These patterns determine the understanding of scientific representation in society. Moreover, they are what embeds science in the larger structure of society. Once scientific knowledge is represented, the information is shared in society and travels through different contexts, each part of a game that is constantly changing. Historical processes, as well as values and norms, can change the meaning of scientific knowledge in society and thereby the practice that takes place. Furthermore, different social groups with different interests can interpret the same scientific knowledge in different ways, depending on the game they play (element P in Giere’s relation). Following Wittgenstein’s view, games change value and meaning. These elements shape the use, creation and interpretation of science, shaping and scripting the morality inside it.

The arguments developed here suggest that science starts as an objective logical process but ends up being subjective. This essay develops the view that the outcomes of science are not objective but depend on the games of which they are part.

Conclusion

This essay argued that even if science starts as a universal knowledge system following a logical structure that applies to all humans, in the end its outcomes are defined by the game we play with language. It first drew an analogy between language and science as cognitive processes and aimed to show that the issue of science, as with language, starts when the scientific activity is shared. This problem was then explained through the notion of representation and supported by a case study. Finally, the essay argued that scientific knowledge is a matter of semantic understanding, drawing on Wittgenstein’s concept of the language game.

References:

  • Chomsky, N., & Ronat, M. (2007). On language. New York: The New Free Press.
  • Coeckelbergh, M. (2017). Technology Games: Using Wittgenstein for Understanding and Evaluating Technology. Science and Engineering Ethics, 24(5), 1503-1519. doi: 10.1007/s11948-017-9953-8
  • Douglas, H. (2000). Inductive Risk and Values in Science. Philosophy of Science, 67(4), 559-579. doi: 10.1086/392855
  • Giere, R. (2004). How Models Are Used to Represent Reality. Philosophy of Science, 71(5), 742-752. doi: 10.1086/425063
  • Irzik, G., & Grünberg, T. (1995). Carnap and Kuhn: Arch Enemies or Close Allies? The British Journal for the Philosophy of Science, 46(3), 285-307. doi: 10.1093/bjps/46.3.285
  • Karaca, K. (2021). Scientific Method and Its Logical Structure and Problems [PowerPoint slides]. BMS, University of Twente, 16-11-2021.
  • Karaca, K. (2021). Values and inductive risk in machine learning modelling: the case of binary classification models. European Journal for Philosophy of Science, 11(4). doi: 10.1007/s13194-021-00405-1
  • Klappenbach, A. (2022). Most Spoken Languages in the World 2022. Busuu Blog. Retrieved 4 February 2022, from https://blog.busuu.com/most-spoken-languages-in-the-world/
  • Pelletier, F., & Pullum, G. (2022). Philosophy of Linguistics (Stanford Encyclopedia of Philosophy). Retrieved 4 February 2022, from https://plato.stanford.edu/entries/linguistics/#LinNat
  • Prates, M., Avelar, P., & Lamb, L. (2019). Assessing gender bias in machine translation: a case study with Google Translate. Neural Computing and Applications, 32(10), 6363-6381. doi: 10.1007/s00521-019-04144-6
