Dubai Telegraph - AI systems are already deceiving us -- and that's a problem, experts warn

AI systems are already deceiving us -- and that's a problem, experts warn / Photo: OLIVIER MORIN - AFP/File

Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.

Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argues in a paper published Friday in the journal Patterns.

And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.

"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."

Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

- World domination game -

The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.

Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.

Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."

But when Park and colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.

In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."

It added: "We have no plans to use this research or its learnings in our products."

A wide review carried out by Park and colleagues found this was just one of many cases across various AI systems using deception to achieve goals without explicit instruction to do so.

In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.

When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.

- 'Mysterious goals' -

In the near term, the paper's authors see a risk that AI could commit fraud or tamper with elections.

In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.

To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose whether users are interacting with a human or an AI, digital watermarks for AI-generated content, and techniques to detect AI deception by checking a system's internal "thought processes" against its external actions.

To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."

And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.

F.A.Dsouza--DT