dartos

joined 2 years ago
[–] [email protected] 2 points 2 years ago

Indexing and tools like llamaindex use LLM-generated embeddings to “intelligently” search for documents similar to a search query.

Those documents are then usually fed into the LLM as part of the prompt (i.e., as context).
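
To make that concrete, here’s a minimal sketch of embedding-based retrieval, assuming the sentence-transformers package is installed. The real llamaindex API looks different, and the document text, model name, and query here are just placeholders:

```python
# Minimal sketch of embedding-based retrieval; documents and model are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Section 12: The landlord may increase rent once per lease term...",
    "Section 3: The security deposit shall not exceed one month's rent...",
]
doc_vecs = model.encode(documents, normalize_embeddings=True)

query = "Can my landlord raise rent mid-lease?"
query_vec = model.encode([query], normalize_embeddings=True)[0]

# With normalized vectors, cosine similarity is just a dot product.
scores = doc_vecs @ query_vec
best_doc = documents[int(np.argmax(scores))]

# The retrieved document becomes context in the prompt fed to the LLM.
prompt = f"Context:\n{best_doc}\n\nQuestion: {query}"
print(prompt)
```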

[–] [email protected] 5 points 2 years ago* (last edited 2 years ago) (2 children)

Yes, you can craft your prompt so that, if the LLM doesn’t know about a referenced legal document, it asks for it; you can then paste the relevant section of that document into the prompt to provide that information.

I’d encourage you to look up some info on prompting LLMs and LLM context.

They’re powerful tools, so it’s good to really learn how to use them, especially for important applications like legalese translators and rent negotiators.
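
For illustration, a prompt skeleton along these lines (purely a sketch, not a tested template) nudges the model to ask for missing documents instead of guessing:

```python
# Hypothetical prompt skeleton; wording and clause text are illustrative only.
system_prompt = (
    "You are a legalese translator. Answer only from the lease text provided. "
    "If the question refers to a clause or document you have not been given, "
    "do not guess; reply asking the user to paste that section."
)

user_question = "Does clause 14.2 let the landlord enter without notice?"

# If the model replies asking for clause 14.2, paste it in and ask again.
followup = (
    "Here is clause 14.2:\n"
    "14.2 The landlord may enter the premises with 24 hours' written notice...\n\n"
    + user_question
)
```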

[–] [email protected] 9 points 2 years ago* (last edited 2 years ago) (5 children)

Generally, training an LLM is a bad way to provide it with information. “In-context learning” is probably what you’re looking for: basically, pasting relevant info and documents into your prompt.

You might try fine-tuning an existing model on a large dataset of legalese, but then it’ll be more likely to generate responses that sound like legalese, which defeats the purpose.

TL;DR: Use in-context learning to provide information to an LLM. Use training and fine-tuning to change how the language the LLM generates sounds.
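
A rough sketch of in-context learning with the OpenAI Python client (v1+); the model name and lease text are placeholders, not recommendations:

```python
# In-context learning: the relevant document goes straight into the prompt.
# Assumes the openai package (v1+) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

lease_excerpt = (
    "Section 7: Tenant shall give 60 days' written notice before vacating..."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "Answer only from the lease text provided."},
        {
            "role": "user",
            "content": f"Lease:\n{lease_excerpt}\n\nHow much notice do I have to give?",
        },
    ],
)
print(response.choices[0].message.content)
```

Fine-tuning, by contrast, would be a separate training job over many legalese examples and would mostly shift how the output sounds rather than teach the model your specific documents.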

[–] [email protected] 10 points 2 years ago* (last edited 2 years ago)

Looks like they got that number from this quote from another Ars Technica article: “…OpenAI admitted that its AI Classifier was not ‘fully reliable,’ correctly identifying only 26 percent of AI-written text as ‘likely AI-written’ and incorrectly labeling human-written works 9 percent of the time.”

Seems like it mostly wasn’t confident enough to make a judgement: it correctly detected AI text only 26% of the time and incorrectly flagged human text as AI 9% of the time. It doesn’t tell us how often it labeled AI text as human or how often it was just unsure.

EDIT: this article https://arstechnica.com/information-technology/2023/07/openai-discontinues-its-ai-writing-detector-due-to-low-rate-of-accuracy/

[–] [email protected] 7 points 2 years ago

Probably money. Given enough money, I’m sure TikTok will ban any search term.

[–] [email protected] -5 points 2 years ago

Woah there. This is a political post on a social media site.

You better stop with those non-rage-inducing comments.

[–] [email protected] 14 points 2 years ago (1 children)

People are dumb.

[–] [email protected] 13 points 2 years ago

This reminds me of a saying an old programming mentor told me.

“To a kid with a hammer, everything is a nail.”

[–] [email protected] -1 points 2 years ago

I don’t think they want to do that anyway. If Fox isn’t being put on blast, CNN is next.

[–] [email protected] 23 points 2 years ago (2 children)

There are no consequences for just about anything if you have enough money :)

[–] [email protected] 22 points 2 years ago

I mean, it should always be some kind of removed third party drawing the lines. But nobody in power wants to give that power up.

[–] [email protected] 3 points 2 years ago

I have to disagree. I’ve been conducting interviews for a fairly large software shop (~2,000 engineers) for about 3 years now, and unless I’m doing an intern or very entry-level interview, I don’t care what language candidates use (both personally and as a matter of company interviewer policy), as long as they can show me they understand the principles behind the interview question (usually the design of a small file system or web app).

Most devs with a good understanding of the underlying principles will be able to start working on meaningful tasks within a few days.

It’s the candidates who spent their time deep-diving into a specific tool or framework (like coming out of a Rails/React boot camp or something) who have the hardest time adjusting to new tools.

Plus, when your language or framework falls out of favor, you’re left without much recourse.
