Legal AI Might Be Accurate… And Still Not *Right* – Above the Law

Everyone knows about hallucinations. Well, apparently not everyone, which is why hallucinations provide so much amusement. Lawyers keep putting them into their briefs and, sometimes, lying about it when caught. Judges are even getting in on the action with hallucinations of their own. The plague of hallucinations remains the most discussed AI threat for lawyers.

But one AI weak spot that gets almost no attention, despite being arguably more dangerous, is the case where AI is both perfectly accurate and fundamentally incomplete.

Hallucinations, while pernicious, should be caught by a human. Sending a brief out the door without cite checking is a you problem, not an AI problem. Sure, the AI hype cycle and seductively confident interface may be making lawyers dumber, but to borrow from Smokey the Bear: only YOU can prevent yourself from stupidity.

But incompleteness arises under a whole different set of circumstances. It’s one thing to search a few hundred cases for helpful precedent, and another to scour millions of documents to make sure there’s nothing harmful in there. This is work that humans simply can’t manage on their own, and there’s no equivalent to cite-checking when the whole assignment is to “prove a negative.” If an AI set to that task misses a document, it’s an “unknown unknown.”

And missing a lone prior-art document buried in the weeds can mean millions in patent litigation.

A new case study delves into this risk of unknown unknowns and how to get ahead of it.

Melange, a patent analytics company, set out to build patent search, monitoring, and mapping tools to aid clients in high-stakes intellectual property issues. Finding those obscure prior art gems hidden within hundreds of millions of global patent filings, machine-translated foreign documents, obscure academic papers, and technical manuals can have massive repercussions in litigation. The average cost of patent litigation runs between $2.3 million and $4 million and the average damages clock in around $24 million. A single missed prior-art document can materially shift those numbers.

“When we find that one killer piece of prior art that the customer thinks might win their case, we lock in the customer for life,” Melange CEO Joshua Beck says.

What Melange discovered is that the primary risk in this work isn’t model quality or the dreaded hallucinations, but infrastructure reliability. Scaling AI search to deal with hundreds of millions of global patents and technical papers takes a lot, and a self-hosted system will struggle with incomplete recall and downtime. And when Melange sought to scale from a manageable 40 million documents to the full global patent corpus of roughly 450 million, they identified this precise problem. Those drawbacks could cost a client money.

So they hooked up with Pinecone, a vector database provider, to address the underlying infrastructure. Using AI for massive searches is limited by recall. “Intelligence, provided by the LLM, is a ‘frozen’ reasoning engine that knows how to think and process language,” Pinecone CEO Ash Ashutosh explained. “Knowledge, on the other hand, represents the dynamic, factual state of the world that the LLM must draw upon. Without infrastructure that derives knowledge from proprietary data, even the most intelligent model is prone to costly, inaccurate, and incomplete conclusions.”

An infrastructure that supports these insanely high levels of recall is what keeps “index structures stable and query performance predictable,” as the study notes. For most tasks a 90 percent recall is all well and good, but when seeking out prior art, a 10 percent failure rate isn’t acceptable. With Pinecone’s help, Melange scaled beyond 600 million documents without reliability issues.
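For readers who want the arithmetic behind “90 percent recall,” here is a minimal sketch of how recall is measured. It is not Melange’s or Pinecone’s method. It assumes a hand-labeled benchmark where the relevant documents are already known, which is exactly what a real prior-art search lacks; the function names and document IDs are hypothetical.

```python
def recall(retrieved_ids, relevant_ids):
    """Fraction of the known-relevant documents that the search returned."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("need at least one relevant document to measure recall")
    return len(relevant & set(retrieved_ids)) / len(relevant)


def missed_documents(retrieved_ids, relevant_ids):
    """Relevant documents the search never surfaced (the 'unknown unknowns')."""
    return sorted(set(relevant_ids) - set(retrieved_ids))


# Hypothetical benchmark: ten prior-art documents a reviewer labeled as relevant.
relevant = [f"DOC-{n:02d}" for n in range(1, 11)]

# Hypothetical search result: nine of the ten, plus two irrelevant hits.
retrieved = [f"DOC-{n:02d}" for n in range(1, 10)] + ["DOC-57", "DOC-88"]

score = recall(retrieved, relevant)
missed = missed_documents(retrieved, relevant)

print(f"recall: {score:.0%}")
print(f"missed: {missed}")
```

Every result the search did return can be individually correct and the score still comes out at 90 percent, because recall only counts what was left behind. On a benchmark the missed document can be named; in a live search nobody knows it exists.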

In the human quest to be distracted by shiny objects, we’re obsessed with debating the merits of these new algorithms and pointing and laughing at the hallucinations. “Lawyers should stop focusing solely on the brain (the model) and start asking about the nervous system (the infrastructure),” Beck explained. “Can this system accurately and reliably scale to the full universe of data without degrading?” If your vendor can’t answer that clearly, well, that is your answer.

Of course, patent litigation is just one context. Run-of-the-mill discovery faces the same problems. At conferences, attendees chatter about new context limits and clever workarounds slowly chipping away at the problem, but building infrastructure capable of handling the load is a key part of the equation. Because the model itself may be “accurate,” but if it’s incomplete, it’s still not right.

Millions at Stake: How Melange’s High-Recall Retrieval Prevents Litigation Collapse [Pinecone]

Joe Patrice is a senior editor at Above the Law and co-host of Thinking Like A Lawyer. Feel free to email any tips, questions, or comments. Follow him on Twitter or Bluesky if you’re interested in law, politics, and a healthy dose of college sports news. Joe also serves as a Managing Director at RPN Executive Search.
