I’m at SARMAC, a conference with a number of research papers on the psychology of lying and lie detection. I’ll liveblog the relevant sessions in followups.
The keynote talk was not directly relevant to deception, but was interesting, so I’ll put my notes here anyway. It was by Jutta Joormann on “Cognitive biases, rumination, and mood regulation in depression.”
Depression is the leading cause of years of life lived with disability; the mortality rate is elevated by a factor of 2.6; 30m Europeans and 20m Americans are affected. The classic model is that negative mood activates negative memories and cognitions, setting up a vicious circle. More recently, people think in terms of emotion regulation: people respond actively to affect (Teasdale 1988: able and motivated to repair mood). A ruminative response style (Susan Nolen-Hoeksema) makes it harder to use other strategies such as reappraisal. Cognitive biases and (now) inhibitory deficits may make it worse. Cognitive theories of depression predict biases in all aspects of processing; the most robust finding is memory bias (Matt et al. 1992: depressed patients recall about 10% more negative words), while findings on attentional biases are inconclusive (Gilboa & Gotlib 97 etc). Eye tracking is the latest method: is the bias engagement with negative information, or difficulty in disengagement? She showed subjects neutral and emotional faces; once the subject fixates on one face, the other is framed (the subject was told to look at framed faces). This gives engagement and disengagement indexes. Depressed participants spend more time looking at angry or sad faces, and take longer to disengage from a sad one. These attentional biases are good at predicting “stage fright” – anxiety about giving a talk, used as a stressor.
Second experiment: cognitive inhibition is a very basic process, controlling access to working memory and discarding irrelevant information. So, is an inability to inhibit negative material associated with ruminative responses? She used a negative priming task: “Please name the colour in which the following word is written” – for example, [red in green type] then [yellow in red type], versus a control pair of [blue in green type] then [yellow in red type]. If the ignored prime word (“red”) has been inhibited, naming the red ink of the probe should be slower. Negative priming is weak in both depressed and remitted-depressed patients.
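Here is a minimal mock-up of such a negative-priming trial pair; the console prompt and timing are my own illustration (a real experiment would present coloured text with software such as PsychoPy and log keypresses):

```python
# A minimal mock-up of a negative-priming trial pair (illustration only).
import time

def colour_naming_trial(word: str, ink: str) -> float:
    """Show a word in a given ink colour; return response time in seconds."""
    start = time.time()
    input(f"Name the ink colour of '{word}' (shown in {ink}): ")
    return time.time() - start

# Experimental pair: the ignored prime word "red" becomes the probe's ink
# colour; if "red" was inhibited on the prime trial, the probe is slower.
colour_naming_trial("red", "green")              # prime: ignore "red"
rt_probe = colour_naming_trial("yellow", "red")  # probe: name red ink

# Control pair: the ignored prime word never reappears as an ink colour.
colour_naming_trial("blue", "green")
rt_control = colour_naming_trial("yellow", "red")

# Intact inhibition shows up as slowing on the experimental probe.
print(f"negative priming effect: {rt_probe - rt_control:+.3f} s")
```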
Third experiment: can depressed people remove negative material from working memory? She used a negative Sternberg task: present one word list in green type (“sad, ugly, depressed”) and one in red (“pretty, healthy, nice”), then show a red frame (meaning the list in the unframed colour should be forgotten), then show a probe word from one list or the other. How much intrusion is there from the list they were supposed to forget? Depressed participants have a much harder time forgetting negative words (and this is significantly correlated with rumination, and also with sorting costs for negative material, though that’s more complex).
Relationship with mood regulation: one mechanism is to recall happy memories after distraction. This works with controls, but not with the remitted depressed, and it makes patients with major depression worse. So difficulties in repairing negative mood states are also associated with depression.
Fourth experiment: is there a longer-term vulnerability? Ian Gotlib at Stanford is working with her on a longitudinal study of the risk to children of depressed mothers (known to be 3–5 times the risk to children of never-depressed mothers, with an elevated risk of substance abuse and anxiety disorders). They recruited never-depressed 9–14-year-old daughters of mothers with recurrent episodes of depression during the daughter’s lifetime (but well now), plus a control sample of never-depressed daughters of never-depressed mothers. This has gone on for ten years and is starting to provide data: Joormann, Talbot and Gotlib 2007 found an attentional bias in the risk group. Now they are doing fMRI work: they scan a girl, ask for a positive memory recollection, scan again, induce negative mood with a sad video, scan, ask for one more positive recollection, and do a final scan. The high-risk group shows activation in the subgenual anterior cingulate cortex where controls show more dorsal ACC activation. It’s as if the at-risk group turn on the amygdala and don’t turn it off.
Finally: can we modify the biases or inhibition effects somehow? She is trying attention bias training with high-risk girls (Tran, Hertel and Joormann; Tran, Siemer and Joormann 2010), using both positive and negative training, which teaches subjects to apply positive or negative interpretations to emotionally ambiguous situations. (By the way, the negative training group showed a greater fall in self-esteem after a stressor task involving unsolvable anagrams.) A depressed group acquired a positive bias from positive training. However, attention training is very boring; will it scale outside the lab?
Iris Blandon-Gitlin, “Detecting Deception: The benefit of depleting executive control in liars.” Aldert Vrij and others note that deception detection is hard, and suggest cognitively demanding interviews: the “cognitive load hypothesis”. Iris’s lab works on such executive-control-depletion strategies. Inhibition is in greater demand during deception than working memory or attention. Her methodology is to get subjects to rate their views on controversial social issues such as abortion; the next day, she gets them to lie about something they feel strongly about. The treatment group is depleted beforehand with the attentional network task (ANT). The lie is recorded, and others are asked to judge it. Hypothesis: depletion is most effective for unprepared senders. Discrimination goes up from 50% to 70%.
Iris Blandon-Gitlin, “Improving cognitive control in liars reduces observers’ lie detection accuracy”: this is the converse of the paper above. Her novel executive-control facilitation strategy exploits the inhibitory spillover effect: increase the urgency to urinate! This gives people more impulse control: they do well on cognitive tasks including the Stroop test, because the inhibitory control centres in the brain aren’t domain-specific. Again, people rated their views on social issues and were told to lie about their views on abortion. They were then assigned to low or high bladder-pressure conditions by varying how much water they drank. They lied or told the truth, and then did the ANT as a manipulation check. 118 observers rated truth/lie in the videos and another 75 rated specific behaviours. Hypothesis: the high-bladder group have more control, feel less cognitive load, and can deceive better. Found: the high-bladder group were more accurate and reported less load; deception detection went down from 55% to 48%; they seemed more anxious but more confident. Effect sizes were small but significant.
Renee Snellings, “The effect of observer language proficiency on second-language lie detection.” Are investigators more likely to find second-language speakers deceptive? Yes, according to much previous work: speech hesitation, and the number and length of pauses, can be associated with deception, as can fidgeting. She examined whether second-language observers might be better able to discriminate than native speakers. 284 undergrads from U Ontario, Quebec and Trois-Rivières watched fifteen videos of people lying or telling the truth and then did standard language tests. Observers could discriminate lies from truth with first-language speakers but not with second-language speakers; and as the observers’ own proficiency decreased, their lie bias increased while their ability to discriminate did not. So having a second-language speaker available to watch an English-language interview is not enough; law enforcement should have an interrogator interview a suspect in his native language.
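The distinction between lie bias and discrimination is usually drawn with signal-detection measures; the following is my own sketch of the standard computation, not the authors’ analysis code, with invented hit and false-alarm rates:

```python
# Signal-detection sketch separating discrimination (d') from response
# bias (c) in lie judgments. "Hit" = a lie judged a lie; "false alarm" =
# a truth judged a lie. A growing lie bias moves c without moving d'.
from statistics import NormalDist

def d_prime_and_criterion(hit_rate: float, fa_rate: float) -> tuple[float, float]:
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # negative = says "lie" more
    return d_prime, criterion

# Hypothetical observer who calls more statements lies as proficiency drops:
print(d_prime_and_criterion(hit_rate=0.60, fa_rate=0.50))  # same d', lie-biased
print(d_prime_and_criterion(hit_rate=0.55, fa_rate=0.45))  # same d', unbiased
```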
Jacqueline Evans, “Detecting deception in non-native English speakers.” Many factors are involved, such as emotional distancing: you can be slightly disengaged when using a non-native language. Can PBCAT (the psychologically based credibility assessment tool, which she developed on an Intelligence Community Postdoctoral Research Fellowship – see Evans, Michael, Meissner and Brandon, JARMAC 2013) offset the language-group effect? She ran a 2×2×2 experiment: truth/lie, native/second-language speaker, and PBCAT or not. The stimuli were alibi videos by Hispanic speakers collected in Texas; the observers were UTexas students from El Paso (mostly Hispanic) or Tyler (nondiverse); the PBCAT group got 10 minutes’ training and used the form. She found that truth accuracy was higher with native speakers and deception accuracy with non-natives; PBCAT improved truth accuracy but not deception accuracy. There was no effect on discrimination, but PBCAT users were more likely to judge natives truthful and non-natives liars. Non-native speakers were more likely not to smile, and there were other complex multivariate effects. Contrary to popular supposition, the liars spoke faster. She found no evidence consistent with cognitive-load theory; it might apply only to high-proficiency second-language speakers. This work also supports the argument for interviewing suspects in their native language.
Siegfried Sporer, “Why Scientific Content Analysis (SCAN) cannot work to detect deception.” Vrij reports that the most widely used deception detection technique is SCAN, developed by Sapir, a former Israeli policeman; you can spend a lot of money going on courses. The idea is that you get the witness or suspect to “write down everything you did from 2 to 4pm yesterday”; the interview comes later. Analysis pays attention to whether people cross stuff out and so on; it uses both lie and truth signals, and is also used to look for issues the writer avoids mentioning. Sporer argues it can’t work: there’s no theory, for example of how lies and truths differ; some of its lie criteria are truth criteria in other systems; some of the criteria have vague definitions; there are variants of the system; there may be insufficient inter-rater agreement; and he’s written to its authors asking for their data and been rebuffed. There is a forthcoming paper (Bogaard, Meijer, Vrij, Merckelbach and Broers, “SCAN is largely driven by 12 criteria: Results from sexual abuse statements”, Psychology, Crime and Law) which may tidy some of this up, but it’s drawn from two policemen’s work. A lot more data are needed, and ground truth. He suggests instead a Brunswik lens model (Sporer, Cramer and Masip, in press). In questions, someone remarked that a guilty suspect is often trying to avoid talking about something, and coding absence is hard.
Missy Wolfman, “Question and Answer: exploring the dynamic process in investigative interviews with children”. This is her first PhD study. Current best practice is to use open questions to elicit information from child witnesses, but studies show interviewers are not good at following this advice. So far, studies have looked only at the numbers of questions of different types; she’s looking at interaction effects. For example, closed questions may get longer answers, locking child and interviewer into a cycle. So she’s interested in question dynamics: adult–adult, adult–child, child–adult and child–child. She applied Bakeman and Quera’s “sequential analysis” for the first time to child abuse, studying taped sex-abuse case interviews from New Zealand with 24 interviewers and 80 children, average age just under 12. The method seems to be a Markov model of speech-act types (see the sketch below). She concludes that interviewers are poor at monitoring and assessing their own performance. In questions it was noted that the training of interrogators in New Zealand is essentially the same as that in Britain.
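Assuming it is indeed a first-order Markov model, the core computation is just a transition matrix estimated from the coded sequence of speech acts; the labels below are invented for illustration, not the study’s actual coding scheme:

```python
# Hypothetical first-order sequential analysis: estimate transition
# probabilities between coded speech-act types. The labels are invented.
from collections import Counter, defaultdict

coded_sequence = ["open_q", "long_ans", "open_q", "short_ans",
                  "closed_q", "short_ans", "closed_q", "short_ans"]

transitions = defaultdict(Counter)
for prev, nxt in zip(coded_sequence, coded_sequence[1:]):
    transitions[prev][nxt] += 1

# Normalise each row of counts into conditional probabilities P(next | prev).
for prev, counts in transitions.items():
    total = sum(counts.values())
    print(prev, "->", {nxt: n / total for nxt, n in counts.items()})
```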
Session on forensics and uncertainty, after lunch.
Annelies Vredeveldt, “Eye remember what happened: Closing your eyes improves recall of events but not face recognition.” Subjects saw a theft, did a filler task, were interviewed with eyes open or closed, and later did a lineup recognition task. This replicated previous experiments in showing that eye closure led to better recall of procedural details; however, it did not improve lineup face recognition. In questions, it was suggested that eye closure during the recall task could help recall, but would not help the later eyes-open recognition task; also, rehearsing the facts doesn’t help face recognition.
Kristy Martire, “Communicating uncertainty in forensic science evidence: Numerical, verbal and visual expressions of likelihood ratios and the weak evidence effect”. Kristy is interested in whether jurors understand forensic reports; she found in earlier work that they underestimate the intended value of evidence when it’s presented verbally, and “weak support for the prosecution” is even interpreted as weak support for the defence! In this work she recruited 477 mTurkers to explore Bayesian updating thoroughly, given numerical (likelihood ratio) and verbal reports of evidence strength. The experiment strongly confirmed the weak evidence effect, but numerical assessments such as “4.5 times more likely” are more effective than the corresponding “weak or limited support” in the context of a crime-scene fingerprint match. She concludes that forensic examiners should perhaps rethink their verbal labels; but none of the verbal labels she has tried so far matches the Bayesian effects well. In questions, people discussed the difficulty of phrases such as “significant but small”, where anchoring suggests that the first word matters most. Also, experts often present evidence in probabilistic terms without actually having the numbers.
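For concreteness, the numerical report is just a likelihood ratio applied to the juror’s prior odds; here is a worked sketch with made-up numbers:

```python
# Worked example of the Bayesian updating jurors were asked to do, with
# made-up numbers. A likelihood ratio of 4.5 ("the evidence is 4.5 times
# more likely if the prints are the defendant's") multiplies prior odds.
def update(prior_prob: float, likelihood_ratio: float) -> float:
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

print(update(prior_prob=0.30, likelihood_ratio=4.5))  # ~0.66: belief rises
print(update(prior_prob=0.30, likelihood_ratio=0.8))  # ~0.26: weak evidence
```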
Kyle Sousa, “Screening for Terrorists: A cross-race analysis of face perception”. Does a screener’s ability to match faces to passport photos degrade with cross-race effects? Increased security at US ports of entry has led to imposters – people using genuine IDs issued to others; ID vendors have boxes of hundreds of IDs and search for the best match. Megreya, White and Burton 2011 showed that it does; Kyle confirmed this in a paper at SARMAC 2011, but noted that photo age and the use of hats or sunglasses moderate the effect. Those studies assumed 50% imposters, but in real life a screener sees fraudulent documents rarely. So now he has tested own-race versus other-race matching at various base rates. All conditions showed the classic cross-race effect, but no base-rate interaction was found. People are still more accurate at identifying people of their own race.
Matt Palmer, “Disconfirming feedback impairs subsequent eyewitness identification of a different culprit”. 15% of witnesses view more than one lineup (Halford 2009), so does post-lineup feedback affect subsequent decisions? In 2010, with Brewer and Weber, he looked at two lineups for the same culprit; he expected effects on response bias, but found effects on discriminability instead: positive feedback made people better on the second task, and negative feedback made them worse. However, even with positive feedback, second lineups were never as good as single lineups. In the latest study, the second lineup was for a second perpetrator of the same crime: 330 subjects across 3 videos, each with two distinct perps; a first lineup, target absent; correct rejecters or foil choosers given either no feedback or simple negative feedback (“the person you’ve chosen is incorrect”); then a second lineup. This confirmed the hypothesis that negative feedback reduces discrimination, using ROC analysis (following Mickes, Flowe and Wixted, “Receiver operating characteristic analysis of eyewitness memory”, 2012). There are subtleties though, as it depends on whether witnesses interpret the feedback as commenting on their ability. Moral: if you’re a cop, don’t give eyewitnesses negative feedback.
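The ROC method here sweeps a confidence criterion and plots correct identifications against false identifications at each level; here is a minimal sketch with invented counts, in the spirit of (but not taken from) Mickes, Flowe and Wixted:

```python
# ROC sketch for lineup decisions, with invented data: counts of suspect
# IDs from target-present lineups (correct) and target-absent lineups
# (false), binned by confidence from high to low. Each cumulative total
# is one operating point on the ROC curve.
correct_ids = [30, 20, 10]   # high, medium, low confidence (hypothetical)
false_ids   = [5, 10, 15]    # hypothetical
n_target_present = n_target_absent = 100

hits = fas = 0
for c, f in zip(correct_ids, false_ids):
    hits += c
    fas += f
    print(f"operating point: correct-ID rate {hits/n_target_present:.2f}, "
          f"false-ID rate {fas/n_target_absent:.2f}")
```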
Session on “Creating and maintaining distorted beliefs and memories”
Maryanne Garry, “Truthiness and falsiness of trivia claims depend on judgmental contexts”. By truthiness, the speaker means cognitive availability plus confirmation bias: people are influenced by stuff that’s frequent, likeable or true. So how much can semantically unrelated photographs influence people’s judgments of truth? She asked hard true/false trivia questions, with the claims accompanied by a related photo, an unrelated photo, or none. With a related photo people were more likely to say “true”; with unrelated photos there was the opposite effect (they make processing harder). This replicated previous work by her and others, and it’s known that the truthiness effect fades with familiarity; so she re-ran the study and duly found the effect weakened. Perhaps photos make retrieval easier, or perhaps they make the trivia questions more concrete and hence easier to process.
Brittany Cardwell, “Photos affect people’s immediate decisions about recently performed actions”. Photos can suggest memories, which may be false; can this happen at once, or does it need time to cook? Is it possible, for example, to mistake a feeling of cognitive availability for truth and adjust one’s recollection at once? She put subjects through a study phase with 40 unfamiliar animal names and instructions either to feed the animal or not (by clicking a button). After a filler task, the subject is asked whether they gave food to the animal; half the time an animal photo is shown. This indeed worked, and the effect size was even greater if subjects had to do more than just click to do the feeding (but still only a few percent either side of 50% – so people are more or less guessing). The effect size is significantly greater if the question is “social” – if subjects are asked whether people fed an animal rather than whether they did – and if the animal name is unfamiliar (e.g. colocolo versus zebra).
Debbie Wright, “Inducing false confessions: The power of misleading images versus misleading text”. Debbie passed her viva just last week; she has worked on whether people are more likely to believe a false newspaper claim from a headline or from a photograph. A number of false-evidence studies have shown that doctored photos can be effective; others that false text can be too. Johnson, Hashtroudi and Lindsay (1993) hypothesise a source monitoring error: we misattribute to longer-term memory anything that becomes familiar and is credible, and this can be enhanced if it’s vivid or perceptually detailed. Photos can be a cognitive springboard, but text gives free rein to the imagination. So she has run experiments on whether text or photos have more effect, and whether the order matters. In the first experiment, she told subjects to click on a driving hazard only when the light was green; clicking on red counts as cheating. Eventually they are falsely accused of cheating; controls were interviewed to see if they believed it, while treatment subjects were shown photo or text “evidence”. Text worked to confirm the false belief, photos worked no better than the control, and the two together were in between. In the second experiment, she hypothesised that showing text first should be more effective, and found a strong anchoring effect whereby text shown first dominated, regardless of whether text or a photo was shown second. Takeaway: text is persuasive, if it’s shown first.
Linda Henkel, “On second thought: memory distrust in young and older adults”. Polczyk’s studies suggest older adults show more changes and inconsistencies in memory than youngsters, e.g. on the Gudjonsson suggestibility scales when the interviewer is unfriendly; but McMurtrie et al found less response change in older adults in response to negative feedback. So are older people changing their responses, or their memories? And what about self-esteem and confidence in cognitive abilities? She tested 57 community-dwelling oldsters with an average age of 78 against 43 undergrads, showing them a rogue government agent kidnapping a suspect, Jack Bauer style. There followed 4 filler questions, 8 non-leading questions and 8 misleading ones (both options wrong). Subjects were given either no feedback, negative feedback (“you got a lot wrong”) or stereotype feedback (“people your age get a lot wrong”). Negative feedback causes response change on misleading questions equally in both groups, while stereotype feedback is effective only on the young; the oldies ignore it. Recall accuracy did decline with increasing age among the older adults, though (they were 64–91).
Patrick Rich, “Correcting the continued influence of misinformation may require belief in the correction”. Wrong news reports are common, and corrections are not entirely effective. Johnson and Seifert 1994 studied a news story with initial misinformation (a jewel theft victim’s son had a gambling debt) and a later correction (he was out of town); the misinformation persisted. This led Patrick to wonder whether implied misinformation (as in this case) is more persistent. So he repeated the experiment, testing implied versus directly stated misinformation (the son was periodically checking the house, versus the police suspect the son because …). He retained data only from subjects who recalled at least one of two corrections. With no correction, there was no significant difference; with a correction, beliefs from implied smears went almost uncorrected, while directly stated misinformation was corrected in about half of subjects. Perhaps participants are unaware of implied influence, or perhaps those in the implied condition find their explanation more satisfying than the correction, as it’s their own. This is supported by the finding that 63% of the implied-condition subjects rejected the correction, while only 53% of those in the direct condition did. Even conditioning on belief in the correction, the implied-condition subjects still believe more in guilt. In questions, I suggested that the more effort people had to put into working out the inference, the harder it would be to shake.
Jane Goodman-Delahunty, “Social persuasion in rapport development with high value interviewees”. This work is funded by the FBI’s high-value detainee (HVD) interrogation group, but not endorsed by them. Are people from Western societies more susceptible to rational-persuasion methods? Coercive strategies lead to unreliable evidence, so what else works? Persuasion, kindness and rewards can have different effects in low- versus high-context cultures. She used Cialdini’s six basic compliance principles to code data from a field study of interviewers of HVD suspects and witnesses, asking 123 interrogators 56 questions about rapport in interviews. All techniques could be classified by Cialdini’s principles: similarity and humour were often used, dissimilarity sometimes, and reciprocity also very often (both by over three quarters of interrogators); the other four principles were far behind, being used by a quarter or less. In conclusion, the Cialdini framework appears valid in a new setting, across five different cultures and both civilian and military interrogators. Under interrogation, she confessed that only a small minority of her 123 interviewees supported coercive questioning methods.
Last keynote talk of the day was from John Hibbing, entitled “Predisposed: The Deep Root of Political Difference”. Again, this is not strongly related to deception, but it was interesting and I’ll blog it anyway. Politics is explosive because of strong preferences; what are they and where do they come from? A traditional answer is that they’re idiosyncratic, changeable and culturally specific (Philip Converse’s views on this dominated the last half-century of political science research). The alternative answer, from psychology, is that political beliefs are stable and at the core of our being; there are red and blue brains. John leans to the latter view; it’s to do with how people feel, think and (crucially) remember. The bedrock principles are group leadership, protection from the out-group, punishment of in-group norm violators, distribution of group resources, and orientation to change/procreation/new lifestyles (interracial marriage 50 years ago, gay marriage today).
Universality emerges at the level of stability-vs-reform factions, as Mill and Emerson argued; Emerson said “it is likely that these irreconcilable differences have a depth of seat in the human condition”. (That summarises his talk.) Openness correlates with reform, conscientiousness with conservatism; agreeableness splits into politeness (right) and empathy (left). EPQ neuroticism, extraversion and impulsivity are uncorrelated between mate pairs; the top matches are church attendance, conservative political views, drinking frequency and political party support. Schwartz’s basic values also match up fairly nicely, as does Haidt’s work on moral foundations: conservatives value loyalty, authority and sanctity. Jost found that conservatives were more likely to have sports decor and a calendar in their rooms, while liberals have books, CDs and art supplies. Also: should poetry rhyme? Should art be realistic? Should novels end with a clear resolution? There are pretty good correlations, universal across countries, but with modest effect sizes.
How can we organise these differences? He argues it’s about negativity bias, which is greater for conservatives. He tests electrodermal activity while showing a variety of images, both aversive and appetitive. Liberals have a flat response, while conservatives show a lower skin-conductance change for appetitive images and a higher one for aversive images. This finding appears robust. There are also marked differences in blink (electromyographic) responses to auditory startle: conservatives blink harder. Amodio et al found brain-imaging differences a few years ago; Schreiber et al found differences in risk tasks this year. He now has his own fMRI project, showing 70+ subjects images that can be disgusting, threatening, neutral or positive; the disgusting ones can involve contamination or mutilation. Conservatives show more activity in the rostromedial prefrontal cortex, involved in repressing emotions, when shown disgusting images. Liberals seem better at feeling pain in disgusting-mutilation images (they show S2 activation); but they’re affected just the same by disgusting-contamination images such as a cockroach on food, or vomit. Twin studies show an MZ correlation of 0.65 versus 0.43 for DZ, suggesting a heritability of 0.44. The only study of allelic variation is by Jaime Settle et al on the dopamine receptor D4; it’s suggestive rather than definite.
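That heritability figure is what Falconer’s formula gives (assuming that is the estimator used here): twice the difference between the MZ and DZ twin correlations.

```latex
% Falconer's formula: heritability from twin correlations
h^2 = 2\,(r_{MZ} - r_{DZ}) = 2\,(0.65 - 0.43) = 0.44
```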
As for recent empirical work, Carraro, Castelli and Macchiella have applied the Stroop test to politics: conservatives’ performance is more impaired by negative words, showing that they pay more attention to the negative. His colleague Mike Dodd has been using eye trackers: do people’s eyes dwell on negative or positive images? Indeed, there’s a left-right difference, with conservatives dwelling much longer on the negative. As well as dwell time, fixation also showed greater negativity bias among conservatives. On the other hand, liberals show a gaze-cueing effect (where the contrary gaze direction of a cartoon figure distracts you from fixating on a target) while conservatives don’t (like autistics). Shook and Fazio’s BeanFest game explores exploratory behaviour: liberals turn over many beans, while conservatives turn over a few, figure out a rule and stick to it; liberals do better on positive beans and conservatives on negative ones, sometimes misidentifying positive beans as negative.
To conclude, people on the left and right differ in deep ways of which they’re generally unaware. We should accept that people with whom we disagree are not merely lazily uninformed, but rather experience the world in a fundamentally different fashion. Indeed, one response to his research was “Don’t do this to me; I need to hate conservatives!” But just as people who accept that sexual orientation is partly genetic are more tolerant of same-sex attraction, the same should apply to politics. Perceptions of the world are hugely influential; people on left and right retain differing types of information, with differing accuracy. And it can be useful for society to have a mixture.
Friday morning’s keynote is on “Acquiring Misconceptions: the role of Knowledge Neglect” by Beth Marsh. She’s interested in the persistence of misconceptions: how do false statements such as “Vitamin C wards off colds” enter the human “knowledge base”? There are certainly many possible sources of error, from fiction and news media through our families to scientists’ misconceptions, but how do they stick? It’s known that people learn errors more easily in fields where they have no prior knowledge, but the more interesting case is where they actually have some. So she sets subjects a general-knowledge test, and two weeks later gets them to read stories with neutral or misleading references: 20% of subjects who previously got an item right now get it wrong (as do 11% of those who got it right with high confidence). See Fazio, Barber, Rajaram, Ornstein and Marsh 2013. There is also an illusion of prior knowledge: over 90% of people who give a false answer attribute it to the story, but over 60% attribute it to general knowledge too. Gilbert’s theory (1991) was that people accept information as true by default, and disbelieving it requires a second step. Some situations can discourage monitoring (such as being embedded in a story), and even if monitoring is attempted it can fail if the match is close.
In her tests, about a third of people miss errors contradicted by facts they got right in a pre-test, and those subjects are later much more likely to get those facts wrong. She calls this knowledge neglect. Speed of retrieval, familiarity and fluency are interpreted as cues to truth. The Moses illusion (Erickson and Mattson 1981) asks subjects to spot tricky questions: “How many animals of each kind did Moses take on the Ark?” should not be answered, as it was Noah. Subject performance varies from 30% to 75% across questions, but the effect is robust. Does it work on subject experts? Indeed, according to her recent work on grad students: subject knowledge cuts the error rate only from about 40% to 30–33%.
One way to get rid of such errors is to get subjects to make explicit truth judgments in the initial phase; but once errors are missed they acquire fluency. Drawing attention to errors can backfire by increasing encoding of the errors; repetition increases suggestibility, as in the eyewitness literature. Reading more slowly has no effect on error detection and also increases suggestibility. Highlighting the dodgy facts in red does help error detection slightly but also increases suggestibility. You can think of this as a fluency war.
What about older adults? Recall goes down with age, but vocabulary and knowledge base increase. It turns out that oldies are less likely to show illusions of knowledge than college students; we emphasise accuracy over speed and are better able to recover (Umanath and Marsh 2012). This may be due to strong prior knowledge plus not being as good at encoding the false information. Bahrick points out that much of our knowledge is marginal – tip-of-the-tongue or otherwise inaccessible – so maybe activating it will refresh it? She found that activating knowledge a week in advance has no effect, but doing so three minutes before a test makes a real difference.
Does the retroactive interference paradigm explain all this as well as it does normal episodic memory errors? Maybe, maybe not. Differences include age effects, source tests and presentation speed; similarities include repeated exposure, retrieval practice and plausibility. So they’re not exactly the same. In questions, someone pointed out that taking flat-earth kids to a planetarium doesn’t work, perhaps because the students sit in a flat auditorium and the heavens go overhead; Beth agreed that wrong beliefs about the world, which people have worked out for themselves, can be extraordinarily hard to correct.
Session on “What happens to our memories when we lie?”
Danielle Polage, “Telling Lies Makes the Truth Less Certain.” Do liars come to believe their fibs? Previous writers have mapped other inflation effects; her earlier work on fabrication inflation showed this might explain lie belief. In her recent work she has studied consistency, getting subjects to lie or tell the truth on day 1 and again on day 2; 71 students completed both sessions of the 2×2 (lie/truth then lie/truth) design. Hypotheses: memory will shift towards the lie, certainty will decrease, and the effect of session 1 will be greater. Found: the effect of lying in session 1 was significant, while the interactions were not; consistent lying was significantly different from consistent truth-telling; and denying true events significantly reduced certainty.
Cheryl Hiscock-Anisman, “Differential recall enhancement, ACID, and deception”. Detecting deception statically is hard, as there are few reliable differences; Cheryl’s research programme works through interviewing with differential recall enhancement, which gives honest responders the opportunity to enhance their responses with more detail, in a positive feedback loop whereby recall leads to additional remembering. Liars work from a rehearsed script, trying to appear cooperative while avoiding contradictions; the trick is to use mnemonics that help the honest. Liars use more careful phrasing, give less detail and are shorter – hence her “assessment criteria indicative of deception” (ACID). She claims 89% efficacy, seen over 16 different studies (Colwell, Hiscock-Anisman and Fede 2013). In a new study, 56 students snuck into the faculty building, “stole” a wallet by moving it, got a week to prepare for interview, and were offered $100 for the most convincing story. The “mnemonics” drive a multiple-recall interview: mental reinstatement of context (sights, smells, …); three forced-choice questions (did someone have an accent? was the gun nearer the door? …); recall from another perspective; another three forced-choice questions; reverse-order recall; three more forced-choice questions (including whether they admit to any potential error); and finally retelling the entire event. The questions throw liars off balance, as they’re unlikely to be in the lie script.
Donna Li, “Do lies become truths over time?” Modern cognitive theories of deception implicate memory (Walczyk, Igou, Dixon and Tcholakian 2013). Lying is a missed opportunity to rehearse the truth, as well as self-generated misinformation: in the first case accuracy is impaired, and in the second subjects will incorporate their own lies; also, as lying is cognitively demanding, they may remember the source. She set out to test these hypotheses. She showed subjects a crime video and asked 18 questions; one group was asked to lie to nine of them. After a distraction they had to answer truthfully. She found no impairment; liars made more commission errors, but the errors they made were novel ones rather than the ones they’d made up; and liars did remember which questions they’d lied to. The experiment was rerun with a 1-week delay, and in this case the truth group did significantly better than a control group, showing a rehearsal effect, while there were no commission errors and liars could still remember their lies. In conclusion, lying does not impair memory accuracy, but liars’ memories are fallible. This can be explained in the source monitoring framework. An implication might be that it’s unreasonable to require offenders to recall their offences accurately; they might not be able to.
Kevin Colwell, “DRE: The interaction among interviewing, memory and deception.” Kevin works with Cheryl (above); he studies the effects of forming lie scripts, and how to exploit them when interviewing. In his latest study, 60 undergrads were invited to imagine being mistreated by a professor, employer or parent. Truth-tellers were asked to provide a witness, and five of these were called. Subjects had fifteen minutes to prepare, and the protocol described by Cheryl was used, after a rapport phase and a free recall phase. They misclassified one truth-teller and three liars.
Amina Memon was rapporteur for the earlier talks. Trying to generate as much accurate information as possible is quite different from trying to create false memories; they have different timescales. Also, liars draw on a skill they’ve developed throughout their lives. Memory changes are complex, as Donna’s work shows. As for DRE, it fits with strategic/tactical methods of questioning suspects, but are there circumstances where these could hinder deception detection? It also matters whether the liar is masking a deceptive act or deceptively masking a true one. And work in the field is confounded by many factors, such as delay and rehearsal. Danielle’s work suggests that the well-rehearsed big lie can be best, as consistency matters more than the number of inaccurate details. Something we haven’t tested yet is what makes people better at lying: she suspects that time to prepare is a big factor, and that we should look at more high-stakes lies (e.g. among the prison population) and at nonverbal cues.
Finally, Richard Kemp talked on “Lost in Translation”. Interviewing witnesses in a second language can induce retrieval-induced forgetting; he’s working with an agency in Australia that investigates workplace accidents. Early recall opportunities help (the testing effect), which has led to first-response interview tools such as Gabbert, Hope and Fisher’s Self-Administered Interview. However, an incomplete initial interview can cause retrieval-induced forgetting of non-reported items. He’s interested in whether doing the first interview without an interpreter can do harm. Bilinguals can have better episodic memory and executive control but poorer semantic memory (recalling a word list in the other language can increase errors). He ran two sessions, separated by a week, on English/Cantonese bilingual students; accurate recall was much higher when the first interview was in the first language, while being interviewed in the second language was equivalent to not being interviewed at all. A second Spanish/English experiment on mTurk used US subjects with English as their first language who had learned Spanish; again, making an initial report in Spanish was the same as not making a first report. Conclusion: do the first report in the witness’s first language.
Session on the cognitive psychology of deception.
George Visu-Petra, “Executive functions involved in deception: An individual differences perspective”. In Miyake’s model, deception involves three executive functions: updating, inhibition and set-shifting; see also the RT-based concealed information test (Verschuere 2009). So individual strengths in executive function should lead to different abilities at deception. With a mock-crime RT-based CIT, effects were found for spatial working memory and inhibition, but overall the evidence was mixed. Interference studies look at the effects of a parallel executive or social task; he found individual differences related to working memory, anxiety, depression and stress. Detection was best with a positive virtual examiner and worst with an angry one. See Visu-Petra, Varga, Miclea and Visu-Petra 2013.
Evelyne Debey, “Delta plots reveal the role of response inhibition in lying”. As lying involves withholding the truth, it has been hypothesised that response inhibition is involved, which is supported by fMRI studies; she’s looking for more behavioural evidence. The Sheffield lie test finds longer reaction times for lying (if the text is yellow, tell the truth; if blue, lie). The Simon task creates conflict between slow deliberate and fast direct routes, allowing the measurement of activation suppression; smaller Simon effects in delta plots indicate greater inhibitory skill. Do lie effects behave the same way? She ran the Sheffield lie test on 54 students, who were classified into small and large lie-effect groups; the former had a flatter or even negative delta plot, supporting the response inhibition hypothesis. In questions, someone suggested conflicting rather than inhibited responses as an explanation; the author acknowledged that this is an ongoing controversy but claimed the balance of empirical evidence favours inhibition.
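A delta plot is simply the difference between two conditions’ reaction-time quantiles plotted against their mean; here is a sketch with invented data, not the study’s analysis code:

```python
# Hypothetical delta-plot computation: bin each condition's reaction
# times into quantiles, then plot the lie-minus-truth difference against
# the mean RT at each quantile. A flat or negative slope at the slow end
# is read as stronger inhibitory control. All data are invented.
import numpy as np

rng = np.random.default_rng(1)
truth_rts = rng.gamma(shape=8, scale=60, size=200)    # ms, invented
lie_rts = truth_rts + rng.normal(90, 40, size=200)    # lie effect, invented

quantiles = np.arange(0.1, 1.0, 0.2)
truth_q = np.quantile(truth_rts, quantiles)
lie_q = np.quantile(lie_rts, quantiles)

for mean_rt, delta in zip((truth_q + lie_q) / 2, lie_q - truth_q):
    print(f"mean RT {mean_rt:6.1f} ms   lie effect {delta:6.1f} ms")
```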
Nicholas Duran, “The temporal unfolding of response competition during deception”. Nicholas takes a dynamical-systems approach and explores response competition. The traditional cognitive-psychology approach sees thinking as a discrete logical process with a response time; he sees the dynamics as continuous (Duran & Dale 2008), and designs experiments to elucidate them. For example, you click a button at the bottom of the screen to reveal one word after another in a question, and at the end a colour button tells you to answer true or false; the experimenter examines the mouse trajectory over the next second or so. This not only shows how a “yes” changes to a “no” as the subject realises she has to lie, but lets the experimenter map the attractor landscape. Thus he can map truth bias by fine-grained comparison of a false “no” response with a false “yes” (which shows greater competition). As well as reaction time, he measures complexity, deviation and timing, combined with PCA. He claims 69.6% correct discrimination with all these additional factors, versus less than 30% with reaction time alone. Studies range from simulated intention to lie to spontaneous cheating. In the latter case he follows Greene and Paxton’s 2009 “Moral decisions” paradigm, asking mTurkers to predict heads/tails outcomes in random binary sequences: “Make a guess: heads or tails – write your answer on a piece of paper before pressing go!” The reward is 5–10c, and the trajectory is recorded. People who get 65% or more right are taken to be dishonest; dishonest-correct answers have a significantly different trajectory. Conclusion: low-level dynamics can help analyse high-level problems.
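Here is a guess at the kind of trajectory features involved (my own illustration, not Duran’s code): maximum deviation from the straight start-to-end path, path complexity, and movement time, which would then be combined across trials with PCA.

```python
# Invented mouse-trajectory features: deviation from the straight path,
# path complexity, and total movement time. Not Duran's actual pipeline.
import numpy as np

def trajectory_features(xy: np.ndarray, timestamps: np.ndarray) -> np.ndarray:
    start, end = xy[0], xy[-1]
    dx, dy = end - start
    rel = xy - start
    # Perpendicular distance of each sample from the straight start-end line.
    deviation = np.abs(dx * rel[:, 1] - dy * rel[:, 0]) / np.hypot(dx, dy)
    path_len = np.sum(np.linalg.norm(np.diff(xy, axis=0), axis=1))
    complexity = path_len / np.hypot(dx, dy)  # 1.0 = perfectly straight
    return np.array([deviation.max(), complexity, timestamps[-1] - timestamps[0]])

# One invented trial: a response that curves toward "yes" before landing on "no".
xy = np.array([[0.0, 0.0], [0.2, 0.5], [0.5, 0.9], [-0.4, 1.0]])
t = np.array([0.0, 0.3, 0.6, 1.1])
print(trajectory_features(xy, t))  # [max deviation, complexity, duration]
```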
Nobuhito Abe, “Neural reward sensitivity predicts dishonest behavior”. He has two hypotheses of honest behaviour, “Will” and “Grace”: respectively, resistance to temptation versus its absence. His hypothesis is that dishonest behaviour comes from reward sensitivity, and he presents high or low ($5 or 25c) rewards or punishments. He measures ventral striatum activation as a signal of reward sensitivity, then runs the Greene and Paxton coin-flip task described in the previous talk. He found significant correlations between reward sensitivity and dishonest behaviour; between the frequency of dishonest behaviour and the dorsolateral prefrontal activity associated with honest decisions; and between this and reward sensitivity. So: low reward sensitivity makes people morally graceful, but people with high sensitivity can resist temptation by force of will.
Bruno Verschuere, “Learning to Lie”. The cognition-based view now dominates the field at the expense of the old stress model. But experimental lies are different from real ones: important real-life lies are often repeated over years, and people get better at what they practise! Treating 50% truth/lie (the usual experimental condition) as the control, he measured frequent lying (and frequent truth-telling). For frequent liars, the usual symptoms in error rate and latency vanished. Longer-term, the more you tell the truth, the harder it is to lie (the truth proportion effect); there were no effects of “lie training”, but he expects these would take months or years to appear. This suggests boundaries to cognition-based lie detection. These studies are about acquiring a skill, not faking.
Friday’s last refereed paper session was a miscellany.
Alice Healy, “How much is remembered as a function of presentation modality?” Errors in communication between pilots and air traffic control can have severe effects; Alice works for NASA and her mission is to minimise errors. At present pilots hear their orders; as airborne data become available, should we present orders as text, pictures or a combination? She asked students to move about a 4×4×4 grid under various modalities, with 1–6 commands at a time. Various combinations of read, hear and see, in parallel and in series, were tried. Single-mode, see was better than hear, which was better than read; dual-mode, see-see, hear-read and see-read were best. In general, subjects did better with messages repeated rather than mixed. There were also practice effects: subjects improved most at reading, which ended up best by the end. Conclusion: repetition and practice matter much more than modality for transmitting air navigation instructions.
Angie Birt, “The effects of bilateral saccadic eye movements on memory for emotional scenes.” It’s widely believed that REM sleep is important for memory consolidation, and the majority of bilateral saccadic eye movements (BSEMs) occur during REM. Christman 2003 found that episodic memory is enhanced by even 30 seconds of BSEM before recall. Eye movement desensitisation and reprocessing (EMDR) is now a common treatment for PTSD; inducing eye movements while the traumatic event is recalled is supposed to make the memory verbalisable. She investigated whether eye movements increase accuracy when they happen before, during or after encoding, or after retrieval. Before encoding was best; movements during encoding and retrieval seem distracting; positive valence gave the strongest response; and the overall effect may be smaller than reported by previous studies. Maybe EMDR works by distracting people from the memory.
Carla MacLean, “Investigating Investigators: Tunnel Vision and Investigation Protocol.” Carla studies workplace accident investigation, where human bias (the fundamental attribution error) and confirmation bias can give investigators tunnel vision. Det Norske Veritas trains investigators to use a cause-analysis chart, which she assessed. She ran a bias study with two clues pointing to human error and two to equipment failure; half the subjects were given a human-error bias and half an equipment-failure bias. In all conditions the cause-analysis methodology reduced the bias.
Kate Houston, “Developing a psychological model of interrogation”. Kate is interested in distinguishing true and false confessions. She uses confederates to set subjects up to cheat or not at academic tasks, then interrogates them. Confession models are decision-making (cost-benefit analysis), psychoanalytic (need to confess), social pressure (common within accusatory interrogations in the USA), and the cognitive behavioural model (none of these is enough). Surveys of prisoners suggested false confessions come from external pressure and consequences; true confessions from internal pressure and proof. She wanted to see if she could replicate these results, and used structural equation models to analyse guilty versus innocent confessions. Significant findings included the correlation of affect, guilt and evidence strength with guilty confessions; interrogation pressure with innocent ones and perceived consequences with both. She suggested that innocent suspects may not realise how much trouble they’re in, so are less stressed. She concludes that we should model true and false confessions differently.
Dorthe Berntsen gave the afternoon keynote on “Involuntary and voluntary remembering of trauma: Key assumptions of Posttraumatic Stress Disorder (PTSD) evaluated in the light of autobiographical memory research”.
In this talk, Dorthe reported a decade of work that demolishes the orthodox view of post-traumatic stress disorder. Traumatic memories do suddenly intrude; for example, a survivor of a 1945 bombing of Copenhagen reacted to 9/11 by resurrecting her wartime memory in detail. How are such memory intrusions triggered? The conventional view was Mardi Horowitz’s (1975): faulty encoding of traumatic memories makes them hard to access and makes their recall intrusive and involuntary, requiring suppressive effort; voluntary recall is reduced by defensive mechanisms too. This influenced the DSM’s introduction of PTSD in 1980: two of the criteria are reduced voluntary and enhanced involuntary recall. She challenges this in detail.
First, she has much evidence against faulty encoding. With David Rubin, she introduced the “centrality of event scale” (CES) to measure the extent to which a memory serves as a reference point for everyday inferences, a turning point in the life narrative and a component of personal identity. Since she introduced the CES in 2006, it has shown a robust positive correlation with PTSD, and positive and negative events turn out to be about equally well integrated. Indeed, people have more difficulty forgetting central aspects of the trauma as PTSD symptom levels increase.
Second, she has much evidence that voluntary and involuntary recall follow the same pattern: emotion at encoding enhances both voluntary and involuntary recall. She did a study of aversive picture recall; the more intense the reaction, the more recall of all types. Ferree and Cahill replicated this (2009) with emotional films. Structured diary studies of involuntary memories tell the same story: voluntary and involuntary recall follow the same pattern in PTSD patients, depressed patients and controls. Emotion of either valence at the time of memory encoding enhances recall, full stop.
Third, she presents an alternative view of persistent intrusive events, based on recent autobiographical memory research. Involuntary memories are common rather than exotic; most studies have shown they’re more common than voluntary ones. The basic mechanism is similar in terms of emotional content, forgetting rate and the dominance of visual imagery. Positive events dominate among nondepressed individuals. Involuntary memories are usually cued by a shared feature with the retrieval situation; they have more mood impact or “flashback quality”. As they are triggered by the recall environment rather than by choice they seem effortless; both types use the same wiring, according to fMRI studies, but the involuntary ones bypass the recall effort. A key is cue item discriminability, which is associated with uniqueness: a unique cue plus a unique scene gives the strongest involuntary recall. However, a highly emotional event such as an assault can end up being tied to a nonunique cue, such as the smell of cigarettes and beer in a case she discussed – leading to multiple cueing. She found this emotional effect increases over the days after a traumatic event, and can break away from the interference processes that otherwise stop common cues from firing memories up.
In summary, stressful events are not disintegrated in autobiographical memory, nor is recall differentially affected. The observed phenomena can be explained by mechanisms we now understand.
In questions, someone asked how the PTSD community was reacting to all this work. The reaction has been mixed, but the two fields are starting to listen to each other.
Heather Buttle, “Emotional content and affective context”. How does mood interact with the emotional content of autobiographical memory? She played music while subjects did a memory task, recalling sad or happy faces. With a short piece of music, happy music brought faster recall; with longer, more intense music it unexpectedly brought slower recall. In all cases happy faces were recalled slightly faster.
Gabriel I. Cook, “Perceptual Fluency and Valence Influence JOLs and Recall Differently for Pure and Mixed Lists”. We don’t pay as much attention to information we reckon we’re going to forget; metamemory is about feelings of knowing and judgments of learning (JOLs). Various manipulations can influence our JOLs, including ease of learning and perceptual fluency. Castel found that large fonts can increase JOL but not recall, while emotionality increases JOL and free recall, but not cued recall. Neutral items in lists with some emotional items are recalled less, being less distinctive. Gabriel’s first experiment asked whether the perceptual fluency effect persists in the presence of emotion for mixed lists (emotional and neutral words); he found that font size and emotionality both increase JOL, contrary to Castel’s findings. His second asked whether emotionality cues recall in pure lists; it doesn’t. Font size plays the significant role, and subjects overestimate its effect.
Lauren Knott, “The Impact of negative emotional stimuli on Bilinguals’ False Memories”. Lauren is interested in how emotion affects false memory production; she uses the DRM paradigm whereby you present a number of words associated with an absent target word, and see if the subjects produce the target at recall. This is robust and works in different languages; and Marmolejo et al have shown that false recall is more common, and confidence is higher if the list is presented in one language and recalled in another. The current explanation is that people create a shared mental lexicon across languages; children don’t have this yet and only do in-language false memory. But no study has looked at false recall of emotion-laden stimuli, despite the problem of bilingual witnesses exhibiting detachment and emotional distance in their second language. She did a 2×2 Urdu/English encoding/retrieval study, and found that both true and false recall were lower cross-language. She’d expected that reliance on “gist” would lead to more cross-language false memory, but this did not happen. She suggests that emotion lexicons have partial or separate conceptual stores in bilinguals. In questions it was suggested that there might be an effect of language distance: cross-language false memories are higher in English to Spanish, but not in English to Urdu or English to Japanese.
Misia Temler, “Was it a Red Shirt or a Blue Shirt? Social contagion contaminates concrete details.” Witnesses are held to be credible if their testimony remains consistent, so researchers have developed paradigms for how consistency can be influenced by others: collaborative recall, the misinformation effect, and the social contagion paradigm (which Misia uses, from Roediger, Meade and Bergman 2001). When people recall a scene with a confederate, they contaminate each other’s accounts. Does this work for events that were not shared with the conversational partner? Barnier and others (2007) found that evaluative feedback could indeed influence personal and rehearsed memory, but subtly, such as by inflating aspects that people approved of. Misia asked whether concrete details could be changed, such as the colour of a shirt. As with Barnier, she got 50 participants to write accounts of four significant life events, such as a first date, then ran a collaborative phase in which the confederate suggested altered versions of one of the six most important points of two of the events. She found that 20% of subjects later recalled a suggested false memory. As far as the subjects knew, the confederate was just another undergrad. There was also plagiarism from the confederates’ scripts. Might high-stakes situations be different?
Saima Noreen, “To think or not to think, that is the question: Suppression and rebound effects in autobiographical memory”. Anderson and Green (2001, Nature 410, 366–9) found that people can be trained to forget neutral material. She has repeated this for personally meaningful autobiographical events. But how stable is the effect over time, and what happens to memories we try but fail to forget? 24 participants recalled memories, did the Anderson-Green think/no-think task a fortnight later, and were retested after a year. Individuals who were poor at suppression in the early test did indeed recall more in the delayed test. Those who’d performed well, however, later showed equivalent recall of baseline and suppressed items. In questions, she remarked that she sees suppression and repression as similar.
Charles B. Stone, “Induced forgetting and reduced confidence in our personal past.” Charles is interested in the confidence people have in their autobiographical memories; it can come from intensity, detail, metamemory or retrieval fluency. But non-retrieved memories have a different mnemonic trajectory: rehearsed items are remembered best; unrelated items come next; and items related to rehearsed items, but not themselves rehearsed, are worst (this is retrieval-induced forgetting: Anderson, Bjork and Bjork 1994). This is robust across a range of stimuli, but is it robust to emotion? He elicited autobiographical memories of assorted valence and learned cues for them; half then had selective practice, with a final cued recall the next day. He found the standard RIF effect, but subjects’ confidence behaved nonlinearly: after selective practice they were more confident in their recall of the unpractised related items when these were positive than when they were negative. Perhaps with negative memories people rely less on retrieval fluency, or more on metamemory. We need to study the difference between positive and negative memory more.
The final session I could attend before heading to the airport was called “This symposium did not really occur: Recent advances in the study of nonbelieved memories”. This is the second “nonbelieved memories” workshop at SARMAC, and was very well attended.
Henry Otgaar, “Experimentally inducing nonbelieved true and false memories using an imagination inflation procedure”. Nonbelieved memories are memories you still have but no longer believe, and were until recently thought to be rare; yet they can be induced in the lab. Mazzoni et al (2010) found that 20% of people reported one: they have a memory that’s implausible, or that was contradicted by another witness, or by evidence. Henry is most interested in the second case, with kids and adults; in previous work he gave subjects a false memory of a hot-air balloon ride, then debriefed them, finding that some were very surprised. Even a month later, nonbelieved memories are phenomenologically similar to true memories. You can also get people to disbelieve true memories. Omission errors are harder to produce than false memories. His new imagination-inflation method has an encoding session on day 1 where subjects listen to action statements; a third of the statements they then imagine and a third they perform. The next day they imagine and perform. On day 15 they’re tested and asked whether they heard, imagined or performed each action; in some cases they get (true or false) feedback to the effect that they just imagined it. They’re then tested on detail, vividness etc. So (imagined, performed, imagined) gives a nonbelieved false memory and (performed, performed, imagined) a nonbelieved true one. 6% of actions resulted in the former and 9% in the latter, versus 12% false memories and 34% true memories. With 8-year-olds the figures are 6%, 8%, 8% and 20%. False memories are very much like true memories. But most people still resist the attempt to induce false memories of both types.
Rob Nash, “Producing nonbelieved false and true memories of recent experiences”. He wants really strong false memories, so rather than imagination inflation he uses doctored evidence. Subjects think they’re doing work on behavioural mimicry, and mimic the research assistant. This is video-recorded and tampered with: the RA later performs different actions, which are spliced in and shown to the subject. A few days later, subjects mostly believe they mimicked those too. This was reported in Clark, Nash, Fincham and Mazzoni 2012. Now he reports the debriefing stage: they told the subjects they’d been manipulated and asked them to re-rate their memories. The debriefing basically worked for belief but much less for memory, giving evidence for the creation of nonbelieved memories. Then they used true videos and fake debriefings: belief ratings don’t now fall to baseline, but do fall significantly, while memory ratings fall too (significantly, but less than belief, and less than in the false-memory case). Again, there are no robust subjective differences between either type of false memory and true memories; they’re just believed less.
Chantal Boucher, “Reasons for withdrawing belief in the occurrence of autobiographical memories”. Chantal surveyed 374 undergrads online who had nonbelieved memories and asked them why they stopped believing. The reasons ranged from plausibility (22%) and alternative attributions (16%), such as substance abuse, to social feedback (32%). This was consistent with the previous literature. Social feedback is even more prominent among single causes as opposed to overlapping ones. Most of these causes are not tied to recollection.
Alan Scoboria, “Social influence and the evaluation of belief in the occurrence of memories”. The key feature of a nonbelieved memory is the conscious decision to stop believing it after it’s challenged. Sheen, Kemp and Rubin report twins fighting over memories; sometimes we defend a memory and sometimes we relinquish it. The most frequent category in the data is being told by another person that an event did not happen. When memory conflicts with social feedback, we get a cognitive dissonance which is also a relational dissonance, and we can’t resolve both. He has a case of a 68-year-old with a strong memory of an event at age 8 that was contradicted when he was 15; the dissonance still bothers this man. People can try to reattribute the memory (“maybe it was a dream”) or appraise the messenger, but eventually they have to defend or relinquish the memory, and decide whether or not to say so. If the stakes are high they may keep quiet, but such strategic regulation may not be best for a relationship. In a sibling relationship, for example, you might not want your brother ever to be right, so you’d deny his rebuttal of your memory but still believe it less. And this isn’t static: it goes round and round a feedback loop in the context of a relationship, which gives insight into the maintenance of autobiographical memory and belief.
The discussant of this session’s papers was Steve Lindsay. Nonbelieved memory is a hot topic, but we should see it in a wider context. Frederic Bartlett described remembering in a social/cultural context, and his book on remembering is still well worth reading. There’s other early research on people attributing their own acts to their partners. It seems easier to get people to believe they did things they didn’t than the other way round; this goes back to the reality-monitoring literature of a generation ago (such as Johnson, Raye and Durso 1980). To test a hypothesis that an event was seen, people try to imagine it; if they’ve already done that, imagining is easier, and this may mislead. Suggested non-occurrence is rarer in the literature, but he found Pezdek and Roe 1997, who tested it in children without finding a significant effect. Wright, Loftus and Hall found something like RIF in 2000: poorer recall of non-reviewed actions. Bass and Davis’s controversial book “The Courage to Heal”, for sex-abuse survivors, preaches that “you do not have to remember”: you should be able to accept that such things happened even if you can’t remember them. The new work is really moving things forward; we can see that belief and remembering can be analysed separately. In questions about ethics, he said that it is not in general OK to implant painful false memories; a nonexistent hot-air balloon ride is probably OK, but a false memory of sexual abuse definitely isn’t.