
'Gramophone Man' - Jeffrey Richter
As I stated in my learning agreement, I intend my project Id Machine to continue past BA (Hons) evaluation. My intention is to use Id Machine as a research project. Although my ambitions are far-reaching, I have in mind, and in planning, practical solutions to reach them.

The direction my research will take after my EMP critique is toward the content of the audio input. As my project stands, it is the qualities of the voice that I’m using as my attributes. For now it is frequency that the system uses as data to judge by. The next step, then, is to bring more detail to the analysis of the audio input data stream. With a firmer grasp of the characteristics of the recordings made, a more engaging and dynamic ‘conversation’ is possible.
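As a rough illustration of how frequency might be measured from a recording, here is a minimal sketch that estimates the fundamental frequency of a short mono frame by autocorrelation. This is not necessarily how Id Machine currently does it; the function name `estimate_pitch_hz`, the search range of 60 to 500 Hz and the synthetic test tone are all assumptions made for the example.

```python
import math

def estimate_pitch_hz(samples, sample_rate, fmin=60.0, fmax=500.0):
    """Estimate the fundamental frequency of a mono frame by autocorrelation.

    fmin/fmax are assumed bounds for a speaking voice, not measured values.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]              # remove any DC offset
    lag_min = int(sample_rate / fmax)            # shortest period to test
    lag_max = min(int(sample_rate / fmin), n - 1)  # longest period to test
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # How similar is the frame to itself when shifted by `lag` samples?
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

# A synthetic 200 Hz tone stands in for a recorded voice frame.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200.0 * t / sample_rate) for t in range(1024)]
pitch = estimate_pitch_hz(tone, sample_rate)
```

A real voice is far noisier than a sine tone, so in practice the estimate would need to be averaged over many frames before it is used to judge anything.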
I understand that there is a long history of conversation between man and machine, and I will continue to reflect on the outcomes and impacts of this line of discourse later in my research.

Watson robot on Jeopardy
Chat bots are designed for this very purpose (conversation). In most cases, such as Eliza, the ‘conversation’ is directed to persuade the participant that they are conversing with another human being. This is usually done through complex systems including decision trees and language processing techniques such as the use of keywords and referencing, etc. Convincing a person that a machine can fully converse like a man has been considered a milestone in artificial intelligence since Alan Turing proposed the Turing test in the article “Computing Machinery and Intelligence” (1950), and since then it has been a subject of great focus for computer scientists, philosophers and many others alike.
It is important to note that the creator of Eliza, Joseph Weizenbaum, used the project to some extent to debunk the test and to showcase the artifice of such an attempt.
“machines are made to behave in wondrous ways, often sufficient to dazzle even the most experienced observer. But once a particular program is unmasked, once its inner workings are explained … its magic crumbles away; it stands revealed as a mere collection of procedures … The observer says to himself “I could have written that”. With that thought he moves the program in question from the shelf marked “intelligent”, to that reserved for curios”-Joseph Weizenbaum
I am acutely aware that my own project is bound to suffer from the same sense of falsehood, but my intent is not to fool an engaged viewer with false characteristics of a human speaker or conversationalist. It is to immerse a participant in an engaging scenario so that, to some extent, they suspend their disbelief and respond with thoughtful and emotional interaction, in a similar way to the suspension of disbelief in cinema and other narrative productions.
I will produce a machine, a system, that will be imbued with elements of personality, not one that intends to directly copy a person in human-to-human interaction.

Anthropomorphisation in film
Storytelling has long used non-human symbols, and through processes such as anthropomorphization these characters have caused audiences to empathize with and relate to them. Id Machine will have some very simple qualities of human conversation, but it is not the inherent qualities that are of interest; it is the action, reaction and projection of those who interact with it. I will direct the scenario in a similar way to a theater production.
What is unique about my project is that the content is entirely user generated. All input and output is audio recorded during the interaction. The work is also generative: the more participants interact with it, the broader the pool of content becomes, and each interaction becomes close to unique.
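The generative idea can be sketched as a growing pool of recordings, where each new input is answered with something an earlier participant said and is then added to the pool itself. This is only an illustration under my own assumptions: the class name `ClipPool`, the file paths and the random choice of reply are hypothetical, and the real system chooses its output using the audio analysis described in this post.

```python
import random

class ClipPool:
    """A pool of user recordings that grows with every interaction."""

    def __init__(self, seed=None):
        self._clips = []                    # paths of recordings made so far
        self._rng = random.Random(seed)     # seedable, so runs can be repeated

    def respond_to(self, new_clip_path):
        """Pick a reply from earlier recordings, then store the new one."""
        reply = self._rng.choice(self._clips) if self._clips else None
        self._clips.append(new_clip_path)
        return reply

    def __len__(self):
        return len(self._clips)

pool = ClipPool(seed=1)
first_reply = pool.respond_to("participant_a.wav")   # nothing to play back yet
second_reply = pool.respond_to("participant_b.wav")  # only one earlier clip exists
```

Because the pool only ever grows, the number of possible replies rises with each participant, which is what makes later interactions close to unique.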
Onto practical applications. The first step is matching audio input to models of simple conversational elements. For example, I may be able to make an educated assumption of the participant’s age and gender by mapping the audio input against a model such as this:

Pitch change through aging (Male and Female)
I could then include this assumption as metadata applied to the audio recording itself. Of course I will need more accurate models and equipment to get to this level.
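A minimal sketch of attaching such an assumption as metadata might look like the following. The pitch bands are rough placeholder values that I have assumed for illustration; they are not read from the chart above, and real voices overlap these bands considerably. The names `PITCH_BANDS` and `tag_speaker_guess` are hypothetical.

```python
# Placeholder bands in Hz: (lower bound, upper bound, label).
# Assumed for illustration only; a real model would need measured data.
PITCH_BANDS = [
    (0.0, 165.0, "adult male (guess)"),
    (165.0, 255.0, "adult female (guess)"),
    (255.0, float("inf"), "child (guess)"),
]

def tag_speaker_guess(metadata, mean_pitch_hz):
    """Return a copy of the recording's metadata with a speaker guess added."""
    tagged = dict(metadata)
    tagged["mean_pitch_hz"] = mean_pitch_hz
    for low, high, label in PITCH_BANDS:
        if low <= mean_pitch_hz < high:
            tagged["speaker_guess"] = label
            break
    return tagged

recording = {"path": "participant_a.wav"}
tagged = tag_speaker_guess(recording, 120.0)
```

Keeping the guess in a separate metadata field means it can be replaced later by a more accurate model without touching the recording.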
I could also generate a model of conversational elements, for example a model of a question. This could be determined through analysis of pitch inclination (see proposed model below).

Potential input matching model example for 'question' inclination
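One way the ‘question’ model could be tested in code is to compare the pitch at the end of an utterance with the pitch before it. The sketch below assumes the input has already been reduced to a list of pitch values in Hz, one per analysis frame, with unvoiced frames stored as `None` or 0. The function name `looks_like_question` and the two thresholds are my own assumptions. A rising ending is common in some spoken questions but not all, so this is a crude first guess at best.

```python
def looks_like_question(pitch_contour_hz, tail_fraction=0.25, rise_ratio=1.15):
    """Guess whether an utterance ends on a rising pitch.

    tail_fraction: share of voiced frames treated as the 'ending' (assumed).
    rise_ratio: how much higher the ending must be than the rest (assumed).
    """
    voiced = [p for p in pitch_contour_hz if p]   # drop unvoiced frames
    if len(voiced) < 4:
        return False                              # too little data to judge
    tail_len = max(1, int(len(voiced) * tail_fraction))
    body, tail = voiced[:-tail_len], voiced[-tail_len:]
    body_mean = sum(body) / len(body)
    tail_mean = sum(tail) / len(tail)
    return tail_mean >= body_mean * rise_ratio

rising = [180, 182, 181, 183, 185, 190, 220, 250]
falling = [200, 198, 195, 190, 185, 180, 170, 160]
rising_is_question = looks_like_question(rising)
falling_is_question = looks_like_question(falling)
```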
I may then be able to apply this method to more abstract concepts, for example a model of intent. Perhaps I can detect aggressive input by creating a model that combines vocal phrasing and volume analysis, on the assumption that input which is loud and short in delivery represents aggressive intent in the content.
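The ‘loud and short’ assumption can be written down directly as a test on volume and duration. In this sketch volume is measured as the RMS level of samples scaled between -1 and 1. The thresholds `loud_rms` and `short_seconds` are placeholder values I have assumed, and the function names are hypothetical. Vocal phrasing is left out here, so it covers only part of the proposed model.

```python
import math

def rms(samples):
    """Root-mean-square level of samples scaled to the range -1..1."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def looks_aggressive(samples, sample_rate, loud_rms=0.3, short_seconds=1.5):
    """Guess aggressive intent when the input is both loud and short.

    Both thresholds are assumptions and would need tuning on real recordings.
    """
    if not samples:
        return False
    duration = len(samples) / sample_rate
    return rms(samples) >= loud_rms and duration <= short_seconds

sample_rate = 8000
shout = [0.8, -0.8] * 4000      # one second at a high level
murmur = [0.05, -0.05] * 20000  # five seconds at a low level
shout_flag = looks_aggressive(shout, sample_rate)
murmur_flag = looks_aggressive(murmur, sample_rate)
```

A loud, short input could just as easily be laughter or surprise, so the result is better treated as one piece of evidence than as a verdict.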

Potential input matching model example for 'question' inclination
Once I am able to determine these elements of conversation fully and with accurate definition, I can produce greater ‘intelligence’ in my system by providing broader and more focused decision-making elements against my stream of data.
The next major step is determining aspects of conversation dynamics. The process of conversation is an integral part of socialising, which in turn is part of an integral desire to communicate. The way in which we communicate is incredibly complex.
“We are the only creatures on earth (as far as we know) that can remember the past as discrete events, then connect those events with present conditions. Then, on the basis of those connections, we can consciously decide what to do, and project possible present actions into the future consequences of those actions. Thus, unlike other animals that react to stimuli as they occur, humans live not only in the present, but in the past and the future. It is this ability to remember the past, relate it to the present, and project into the future that is a special province of humans”- ‘Taking ADvantage: Social basis of human behavior’- Richard F Taflinger
This will be a major topic of research later in my project. For now I will deal with assumptions backed by research, and this research will inform the direction of my intended scenario. One example is behavior during turn taking in a conversation:

“Turn-taking dynamics. In a given conversation, only one person (the speaker) speaks at any given time before another (different) participant (a listener of the same group) is entitled to speak. Within a single conversation, several non-overlapping sub-conversations can nucleate”-Massimo Mastrangeli, Martin Schmidt and Lucas Lacasa (2010)
This element will play heavily in the experience of my installation. A power dynamic is created instantly, simply through the act of speaking and listening. The act of turn taking will produce interesting sociological and psychological information.
“Joining/leaving force balance. Participants in a specific conversation remain in the conversation as long as they feel actively involved in it up to their preferred degree; otherwise, they start to wish to leave the conversation. We model this lively behavior by assigning a degree of happiness to each participant of the conversation. Happiness hereby stands for e.g. attention span, patience, assertiveness, self-esteem, and more: it is the index of the willingness of a participant to remain in a given conversation”-Massimo Mastrangeli, Martin Schmidt and Lucas Lacasa (2010)
This is of great concern for my project: participation in ‘conversation’ is key for all other information gathering and for the overall quality of content. I will need to think further about ways to derive information about intent from a user. It will be difficult for me to accurately measure a participant’s will in a live scenario, so I must think around this problem.
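One way of thinking around it is to track a stand-in for the ‘happiness’ index from things the system can already hear, such as how long the participant speaks and how long they stay silent. The sketch below is not the model from the paper quoted above. The update rule, the `gain`, `decay` and `threshold` values and the function names are all assumptions of mine, meant only to show the shape of the idea.

```python
def update_happiness(happiness, reply_seconds, silence_seconds,
                     gain=0.1, decay=0.05):
    """Nudge a 0..1 index up for time spent speaking, down for silence.

    gain and decay are assumed weights, not values from any study.
    """
    happiness += gain * reply_seconds - decay * silence_seconds
    return max(0.0, min(1.0, happiness))   # keep the index within 0..1

def wants_to_leave(happiness, threshold=0.2):
    """Treat a low index as a sign the participant may be about to leave."""
    return happiness < threshold

engaged = update_happiness(0.5, reply_seconds=2.0, silence_seconds=1.0)
drifting = update_happiness(0.5, reply_seconds=0.0, silence_seconds=8.0)
```

If the index drops near the threshold, the installation could respond by changing its output to try to draw the participant back in.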

Rorschach tests
Ideally, the scenario I’m looking to create is a kind of Rorschach test in which the interpretation and response of the participant is a point of interest and study. What’s more, this particular interaction means that the participant is both providing information and interpreting information that others have provided. The interaction becomes a dynamic psychological environment.
To do this properly I need greater control over and direction of the audio output. I need to understand how conversation works and what other parameters I can use as data to process more deeply, in order to engage the participant further and provoke conversation.