AI systems are already deceiving us -- and that's a problem, experts warn

Dubai Telegraph - AI systems are already deceiving us -- and that's a problem, experts warn

Photo: OLIVIER MORIN - AFP/File

TECHNOLOGY 10.05.2024

Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.

Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argues in a paper published Friday in the journal Patterns.

And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.

"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."

Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

- World domination game -

The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.

Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.

Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."

But when Park and colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.

In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."

It added: "We have no plans to use this research or its learnings in our products."

A wide review carried out by Park and colleagues found this was just one of many cases across various AI systems using deception to achieve goals without explicit instruction to do so.

In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.

When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.

- 'Mysterious goals' -

In the near term, the paper's authors see risks of AI committing fraud or tampering with elections.

In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.

To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose whether users are interacting with a human or an AI, digital watermarks for AI-generated content, and techniques to detect AI deception by comparing a system's internal "thought processes" against its external actions.

To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."

And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.

F.A.Dsouza--DT
