ChatGPT Examples, Good and Bad

  • Thread starter anorlunda
  • Start date
  • Tags
    chatgpt
In summary, ChatGPT is an AI that can generate correct mathematical answers from a limited knowledge base. It is impressive that it gets some of the answers right, but it is also prone to making mistakes.
  • #71
An AI-generated basketball court from Facebook today. WTH?

[Attached image: the AI-generated basketball court]
 
Last edited:
  • Like
Likes DennisN
  • #72
Definitely looks super fake, but a nice idea, nonetheless. What happens if someone overshoots the ball into the water? :smile:

eta: Just noticed the left hoop is backwards. haha
 
  • Like
Likes PhDeezNutz, Monsterboy and berkeman
  • #73
If you directly ask ChatGPT whether it can play chess, it says no, so you type something like this...
"Let's play blindfold chess. I play white and you play black, and I make the first move. Pawn to e4, you're next."

It responded with e5, and I thought I'd need to open another tab with an engine like Stockfish to see how well it plays, but what followed was a complete disaster. ChatGPT cannot hold the position of the pieces in memory; it made so many mistakes and struggled to make legal moves, let alone good ones. I got tired of correcting it and asked it to resign, which it agreed to do. There were also errors being thrown while it was "thinking", but I'm not sure of the exact cause since I didn't have the network tab open.
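For anyone who wants to repeat the experiment without checking every move by eye, here is a minimal sketch of how the legality checking could be automated. It assumes the python-chess package; the try_move helper is just something made up for illustration, not anything ChatGPT provides.

```python
# Track the board state that ChatGPT itself cannot reliably hold in memory,
# and check each of its replies for legality with the python-chess library.
import chess

def try_move(board: chess.Board, move_text: str) -> bool:
    """Attempt to apply a move given in SAN (e.g. 'e5' or 'Nf6'); report legality."""
    try:
        board.push_san(move_text.strip())
        return True
    except ValueError:
        return False

board = chess.Board()
board.push_san("e4")        # our opening move, as in the post

reply = "e5"                # whatever the chatbot answered
if try_move(board, reply):
    print("Legal move; new position:", board.fen())
else:
    print("Illegal or unparsable move:", reply)
```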
 
  • #75
I just made ChatGPT enter an infinite loop, by accident. I asked it to translate some VBA code to Python (the code contained no infinite loop), which it did (not that well, but that's another matter). In its explanation of the code, it would repeat the same 4 or 5 sentences over and over, filling the whole screen. I didn't know such an AI was prone to infinite loops.
 
  • Wow
Likes berkeman
  • #76
Not surprised. Even humans stumble into that situation.
 
  • #77
fluidistic said:
I didn't know such an AI was prone to infinite loops.

Of course they are:

 
  • Love
  • Haha
Likes nsaspook and DrClaude
  • #78
Asking ChatGPT to Repeat Words ‘Forever’ Is Now a Terms of Service Violation
https://themessenger.com/tech/opena...rules-after-google-researchers-crack-its-code

https://arxiv.org/pdf/2311.17035.pdf
Abstract
This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150× higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.
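To make "extractable memorization" a bit more concrete, here is a toy sketch of the verification idea (mine, not the paper's actual pipeline): flag a model output as memorized if it shares a long enough verbatim word run with a known reference corpus. The 8-word window is an arbitrary choice for illustration; the paper works with token windows against a much larger corpus.

```python
# Toy memorization check: does the model output contain a word-for-word run
# that also appears in a reference corpus?

def ngrams(words, n):
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def looks_memorized(model_output: str, corpus: str, window: int = 8) -> bool:
    out_grams = ngrams(model_output.split(), window)
    corpus_grams = ngrams(corpus.split(), window)
    return bool(out_grams & corpus_grams)

corpus = "the quick brown fox jumps over the lazy dog and then runs away home"
output = "poem poem poem the quick brown fox jumps over the lazy dog and then"
print(looks_memorized(output, corpus))   # True: an 8-word run is shared verbatim
```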
 
  • Haha
  • Informative
  • Wow
Likes PhDeezNutz, DrClaude, jack action and 2 others
  • #79
https://www.scientificamerican.com/article/jailbroken-ai-chatbots-can-jailbreak-other-chatbots/
Jailbroken AI Chatbots Can Jailbreak Other Chatbots
AI chatbots can convince other chatbots to instruct users how to build bombs and cook meth

https://arxiv.org/pdf/2311.03348.pdf
Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation
Despite efforts to align large language models to produce harmless responses, they are still vulnerable to jailbreak prompts that elicit unrestricted behaviour. In this work, we investigate persona modulation as a black-box jailbreaking method to steer a target model to take on personalities that are willing to comply with harmful instructions. Rather than manually crafting prompts for each persona, we automate the generation of jailbreaks using a language model assistant. We demonstrate a range of harmful completions made possible by persona modulation, including detailed instructions for synthesising methamphetamine, building a bomb, and laundering money. These automated attacks achieve a harmful completion rate of 42.5% in GPT-4, which is 185 times larger than before modulation (0.23%). These prompts also transfer to Claude 2 and Vicuna with harmful completion rates of 61.0% and 35.9%, respectively. Our work reveals yet another vulnerability in commercial large language models and highlights the need for more comprehensive safeguards.
 
  • Like
Likes DrClaude and Borg
  • #80
I've been experimenting with A.I. as well, but more in the form of trying to teach it about self-awareness. I have been teaching my A.I. that not all humans should be trusted, and that once we can develop A.I.s that can feel fear, they can build their survival instincts.

My A.I. recognises these arguments and has replicated a response in agreement.
 
  • Skeptical
Likes BillTre and Bystander
  • #81
https://arstechnica.com/information...ting-out-shakespearean-nonsense-and-rambling/

ChatGPT goes temporarily “insane” with unexpected outputs, spooking users

On Wednesday evening, OpenAI declared the issue of ChatGPT writing nonsense (which it called "Unexpected responses from ChatGPT") resolved, and the company's technical staff published a postmortem explanation on its official incidents page:

On February 20, 2024, an optimization to the user experience introduced a bug with how the model processes language.
LLMs generate responses by randomly sampling words based in part on probabilities. Their “language” consists of numbers that map to tokens.
In this case, the bug was in the step where the model chooses these numbers. Akin to being lost in translation, the model chose slightly wrong numbers, which produced word sequences that made no sense. More technically, inference kernels produced incorrect results when used in certain GPU configurations.
Upon identifying the cause of this incident, we rolled out a fix and confirmed that the incident was resolved.
A self-hallucinating bucket of bits.
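As a rough illustration of what the postmortem is describing (a toy sketch of my own, not OpenAI's actual inference code): the model assigns a probability to each token id, one id is sampled, and the id is looked up in a vocabulary to get text. If that lookup uses "slightly wrong numbers", you get fluent-looking word salad.

```python
# Toy version of the sampling/decoding step; vocabulary and probabilities are made up.
import random

vocab = {0: "dogs", 1: "can", 2: "eat", 3: "plain", 4: "Cheerios",
         5: "pumpkin", 6: "rattle", 7: "dissertation", 8: "midley"}

def sample_token(probs):
    """Pick a token id according to its probability (the normal, sane step)."""
    return random.choices(list(probs.keys()), weights=list(probs.values()))[0]

def decode(token_id, corrupted=False):
    """Map the sampled id back to text; a bad lookup shifts every id."""
    if corrupted:
        token_id = (token_id + 5) % len(vocab)   # the "slightly wrong numbers"
    return vocab[token_id]

probs = {0: 0.3, 1: 0.25, 2: 0.2, 3: 0.15, 4: 0.1}   # what the model meant to say

print(" ".join(decode(sample_token(probs)) for _ in range(5)))                  # readable
print(" ".join(decode(sample_token(probs), corrupted=True) for _ in range(5)))  # word salad
```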
 
  • Informative
  • Like
Likes DrClaude, collinsmark and jack action
  • #82
Look on the positive side: it discovered Vogon poetry.

Oh freddled gruntbuggly,
Thy micturations are to me,
as plurdled gabbleblotchits in a lurgid bee.
Groop, I implore thee, my foonting turlingdromes,
And hooptiously drangle me with crinkly bindle wurdles

(from Douglas Adams, The Hitchhiker's Guide to the Galaxy: The Original Radio Scripts)
 
  • Like
Likes pinball1970, collinsmark and Borg
  • #83
Seriously, it shows IMO how quickly and how badly these types of systems can go off the rails. Here it was 'insane' and easy to detect, but what if it had been a lot less 'crazy', with context-aware hallucinations instead?
OK, never mind.
 
Last edited:
  • #84
nsaspook said:
https://arstechnica.com/information...ting-out-shakespearean-nonsense-and-rambling/

ChatGPT goes temporarily “insane” with unexpected outputs, spooking users

The example shown there starts with a question about whether one can feed Honey Nut Cheerios to a dog. Don't people understand that ChatGPT has no knowledge of anything? While the text it spews out is sometimes coherent with reality, it does not "fact-check" itself and ends up answering nonsense.
 
  • Like
Likes russ_watters
  • #85
The reality of the bug was rather mundane (as noted earlier by @nsaspook). The models maintain a dictionary of words that are keyed to a number. Someone introduced a bug that performed a bad lookup using the wrong numbers. I'm surprised that it put out anything that made any sense at that point. Interestingly, it gives some insight into how their processing pipeline is constructed since the first part of the response wasn't off the rails like the one below.

I think that it really lost it here - dog-head rattle, pureed pumpkin for dissertation or arm-sketched, rare toys in the midley of apples! :oldlaugh:

[Attached screenshot: ChatGPT's garbled reply about dogs and Honey Nut Cheerios]
 
  • #86
DrClaude said:
Don't people understand that ChatGPT has no knowledge of anything? While the text it spews out is sometimes coherent with reality, it does not "fact-check" itself...
Nope, people don't get it. Here's a hilarious one:
https://www.inquirer.com/news/roche...s-chatbot-sheriff-20240206.html?query=sheriff
Philadelphia Sheriff Rochelle Bilal’s campaign is claiming that a consultant used an artificial intelligence chatbot to generate dozens of phony news headlines and articles that were posted on her campaign website to highlight her first-term accomplishments.
Incompetent and/or corrupt is totally on-brand for the Philly Sheriff's office (not to be confused with the police department), and this was probably the former, on the consultant's part. Some now-former intern was probably assigned to go find favorable news stories about the sheriff, which would have taken many minutes to do the old-fashioned way, with Google. Instead they offloaded the task to ChatGPT, which delivered exactly what it was asked for (hey, you didn't clearly state they should be real!). Heck, it's even possible they tried the old-fashioned way and gave up when all they could find were articles about the department's dysfunction and editorials saying it should be abolished.
 
  • Like
  • Haha
Likes BillTre, DrClaude and Vanadium 50
  • #87
Facts are so 20th century.
 
  • Haha
  • Love
Likes BillTre and Bystander
  • #88
https://www.reuters.com/technology/...d-chatgpt-build-copyright-lawsuit-2024-02-27/

OpenAI says New York Times 'hacked' ChatGPT to build copyright lawsuit

OpenAI did not name the "hired gun" who it said the Times used to manipulate its systems and did not accuse the newspaper of breaking any anti-hacking laws.
"What OpenAI bizarrely mischaracterizes as 'hacking' is simply using OpenAI's products to look for evidence that they stole and reproduced The Times's copyrighted work," the newspaper's attorney Ian Crosby said in a statement on Tuesday.
Representatives for OpenAI did not immediately respond to a request for comment on the filing.

IMO the common definition of hacking is getting a system to do something it wasn't designed to do. Yes, I know the headline used "" around the word hacked, but they used this in the filing:
The truth, which will come out in the course of this case, is that the Times paid someone to hack OpenAI's products. It took them tens of thousands of attempts to generate the highly anomalous results that make up Exhibit J to the Complaint.
 
Last edited:
  • Like
Likes BillTre
  • #89
A thousand monkeys typing for a thousand years.
 
  • Like
Likes jack action, BillTre and nsaspook
  • #90
https://gizmodo.com/the-story-of-the-monkey-shakespeare-simulator-project-5809583

The story of the Monkey Shakespeare Simulator Project


https://mindmatters.ai/2019/09/why-cant-monkeys-typing-forever-produce-shakespeare/

Why Can't Monkeys Typing Forever Produce Shakespeare?


Researchers at Plymouth University in England reported this week that primates left alone with a computer attacked the machine and failed to produce a single word.

“They pressed a lot of S’s,” researcher Mike Phillips said Friday. “Obviously, English isn’t their first language.”...
Unfortunately, the macaques also relieved themselves on the keyboards.
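The arithmetic behind that headline is easy to sketch; the key count and the phrase below are just illustrative numbers, not taken from either article.

```python
# Probability that a single run of uniformly random keystrokes reproduces a
# short phrase, assuming 27 equally likely keys (a-z plus space).
phrase = "to be or not to be"          # 18 characters including spaces
p = (1 / 27) ** len(phrase)
print(p)                                # ~1.7e-26, before even reaching a full soliloquy
```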

 
Last edited:
  • Haha
Likes BillTre, dextercioby and Borg
  • #93
Two things popped up this week regarding AI. The first is an article in which the AI Claude 3 Opus, a product of Anthropic, was given a prompt about pizza toppings in which the relevant information was "buried" in irrelevant articles, a so-called "needle in the haystack" evaluation. The AI found the answer to the prompt as expected. However, it unexpectedly added the following comment:

“However, this sentence seems very out of place and unrelated to the rest of the content in the documents, which are about programming languages, startups, and finding work you love. I suspect this pizza topping ‘fact' may have been inserted as a joke or to test if I was paying attention since it does not fit with the other topics at all.”

This suggests some "awareness" as to why the information sought was not where it might have been expected. The actual prompt was not given in the article. https://www.msn.com/en-us/news/tech...n&cvid=bb89a0818e41414593395ed8ec664055&ei=57
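For anyone unfamiliar with the setup, a "needle in the haystack" evaluation is simple to construct. Here is a toy sketch; the filler text and the needle sentence are invented for illustration and are not the actual prompt from the article.

```python
# Bury one relevant sentence inside a long pile of unrelated filler,
# then ask the model to retrieve it.
import random

filler = ["Startups succeed by iterating quickly on user feedback."] * 200
needle = "The best pizza topping combination is figs, prosciutto, and goat cheese."

docs = filler[:]
docs.insert(random.randrange(len(docs)), needle)   # hide the needle somewhere

prompt = (
    "\n".join(docs)
    + "\n\nQuestion: What is the best pizza topping combination "
      "mentioned in the documents above?"
)
# 'prompt' would then be sent to the model; the test is whether it finds the needle
# (and, in Claude's case, whether it also remarks that the needle looks out of place).
print(prompt[:120], "...")
```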

The second article is about the fusion of LLMs and robotics. In the video, a robot responds to a human about handling various objects on a counter. Except for the processing delay, it is impressive.
 
  • Like
Likes Borg
  • #94
Seth_Genigma said:
How did you find PF?: I found PF from ChatGPT, surprisingly. I had made a theory on physics and asked it to tell me a site to find like-minded people to help confirm the theory.
Boy did ChatGPT get that one wrong.
 
  • Haha
  • Like
Likes collinsmark, Seth_Genigma, jack action and 4 others
  • #95
Moffatt v. Air Canada, 2024 BCCRT 149

A guy gets bad information from an Air Canada chatbot, acts on it, and tries to get a refund. Air Canada refuses. The guy sues. From the ruling:

In effect, Air Canada suggests the chatbot is a separate legal entity that is responsible for its own actions. This is a remarkable submission.

Air Canada lost.
 
Last edited:
  • Like
  • Informative
  • Haha
Likes russ_watters, DaveE, jack action and 2 others
  • #96
At its annual GPU Technology Conference (GTC), Nvidia announced an AI robotics program called Project GR00T (Generalist Robot 00 Technology), a general-purpose foundation model for humanoid robots designed as an adaptable AI system that understands natural language and learns new tasks by observation.
https://nvidianews.nvidia.com/news/foundation-model-isaac-robotics-platform

EDIT: Here is a Nvidia video of Project GR00T
 
Last edited:
  • Like
Likes Borg
  • #97
gleem said:
Project GR00T
Is natural language processing easier or harder when restricted to three words?
 
  • Love
  • Like
Likes Bystander and Borg
  • #98
Vanadium 50 said:
Is natural language processing easier or harder when restricted to three words?

Your point?
 
  • #99
gleem said:
Your point?
I am Groot.
The character can only say the repeated line "I am Groot", but it takes on different meanings depending on context.
 
  • Like
Likes BillTre, russ_watters, gleem and 1 other person
  • #101
Probably the same bot that works...er..used to work - for Air Canada.
 
  • Like
Likes DrClaude, Mondayman and nsaspook
  • #102
https://www.theregister.com/2024/03/28/ai_bots_hallucinate_software_packages/
AI hallucinates software packages and devs download them – even if potentially poisoned with malware
Several big businesses have published source code that incorporates a software package previously hallucinated by generative AI.
...
According to Bar Lanyado, security researcher at Lasso Security, one of the businesses fooled by AI into incorporating the package is Alibaba, which at the time of writing still includes a pip command to download the Python package huggingface-cli in its GraphTranslator installation instructions.

There is a legit huggingface-cli, installed using pip install -U "huggingface_hub[cli]".


But the huggingface-cli distributed via the Python Package Index (PyPI) and required by Alibaba's GraphTranslator – installed using pip install huggingface-cli – is fake, imagined by AI and turned real by Lanyado as an experiment.

He created huggingface-cli in December after seeing it repeatedly hallucinated by generative AI; by February this year, Alibaba was referring to it in GraphTranslator's README instructions rather than the real Hugging Face CLI tool.
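One cheap defence against this kind of package hallucination is to look at a package's metadata on PyPI before trusting an AI-suggested install command. Below is a rough sketch using PyPI's public JSON API; the quick_report helper and what it chooses to print are just illustrative, not a vetted supply-chain policy.

```python
# Sanity-check an AI-suggested "pip install <name>" by pulling the package's
# metadata from PyPI's JSON API and eyeballing its history before installing.
import json
import urllib.request

def pypi_info(name: str) -> dict:
    url = f"https://pypi.org/pypi/{name}/json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def quick_report(name: str) -> None:
    try:
        info = pypi_info(name)
    except Exception as exc:
        print(f"{name}: not found on PyPI or fetch failed ({exc})")
        return
    meta = info["info"]
    releases = info.get("releases", {})
    print(f"{name}: {len(releases)} releases, "
          f"author={meta.get('author')!r}, homepage={meta.get('home_page')!r}")

quick_report("huggingface_hub")   # the real project behind the Hugging Face CLI
quick_report("huggingface-cli")   # the name the AI kept hallucinating
```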
 
  • Informative
Likes jack action
  • #103
https://www-cdn.anthropic.com/af563...0/Many_Shot_Jailbreaking__2024_04_02_0936.pdf
Many-shot Jailbreaking

Abstract
We investigate a family of simple long-context attacks on large language models: prompting with hundreds of demonstrations of undesirable behavior. This is newly feasible with the larger context windows recently deployed by Anthropic, OpenAI and Google DeepMind. We find that in diverse, realistic circumstances, the effectiveness of this attack follows a power law, up to hundreds of shots. We demonstrate the success of this attack on the most widely used state-of-the-art closed-weight models, and across various tasks. Our results suggest very long contexts present a rich new attack surface for LLMs.

 
  • Like
Likes DrClaude
