Singularity | Artificial Intelligence (ai), Technology & Futurology

6 readers
1 users here now

About:

This sublemmy is a place for sharing news and discussions about artificial intelligence, core developments of humanity's technology, and the societal changes that come with them. Basically a futurology sublemmy centered around AI, but not limited to AI only.

Rules:
  1. Posts that don't follow the rules, and whose posters don't bring them into compliance after the violation is pointed out, will be deleted no matter how much engagement they got, and then reposted by me in a way that follows the rules. I'm going to wait a maximum of 2 days for the poster to comply before I decide to do this.
  2. No low-quality or wildly speculative posts.
  3. Keep posts on topic.
  4. Don't make posts whose main focus is a link to a paywalled article.
  5. No posts linking to reddit posts.
  6. Memes are fine as long as they are high quality and/or can lead to serious on-topic discussions. If we end up having too many memes, we will create a meme-specific singularity sublemmy.
  7. Titles must include information on how old the source is, in this format: dd.mm.yyyy (e.g. 24.06.2023).
  8. Please be respectful to each other.
  9. No summaries made by LLMs. I would like to keep quality of comments as high as possible.
  10. (Rule implemented 30.06.2023) Don't make posts whose main focus is a link to a tweet. Melon decided that content on the platform is going to be locked behind a login requirement, and I'm not going to force everyone to make a Twitter account just so they can see some news.
  11. No AI-generated images/videos unless their role is to showcase new advancements in generative technology that are no older than 1 month.
  12. If the title of the post isn't the original title of the article or paper, then the first thing in the body of the post should be the original title, written in this format: "Original title: {title here}".

Related sublemmies:

!auai@programming.dev (Our community focuses on programming-oriented, hype-free discussion of Artificial Intelligence (AI) topics. We aim to curate content that truly contributes to the understanding and practical application of AI, making it, as the name suggests, “actually useful” for developers and enthusiasts alike.)

Note:

My posts on this sub currently rely VERY heavily on info from r/singularity and other subreddits on reddit. I'm planning to at some point make a list of sites that write/aggregate the news this sublemmy is about, so we can get news faster and not rely on reddit as much. If you know any good sites, please DM me.

founded 1 year ago
MODERATORS
26

PIKA LABS site: https://www.pika.art/demo

27

A protein secreted by seemingly dormant cells in skin moles causes hair to grow again. That’s a big—and potentially useful—surprise.

28

Looking for a self-hosting solution or some free alternatives to either ChatGPT 4 or, ideally, Sudowrite.

The latter does most of what I want it to do, but the free trial is super limiting and the paid tier asks too much for not much of a word bump.

The former is only available with a sub, and it requires way too much fiddling considering I'd be paying $20 a month to do what I want, but it's an option. Sadly, I need GPT-4 if I go that route and can't cruise on 3.5 for free.

I'm writing a novel, and while I don't care about AI doing the actual writing for me, I do want something to help me organize my ideas or even brainstorm. GPT-3.5 just doesn't have the token bandwidth for that. Sudowrite does an excellent job with it, but the pricing is stupid at $10 for 30k words. I went through the 4k free trial just trying to figure out how it works.

I know there's a slew of self-hosted chatbots, but I haven't seen anyone use them for writing, and searching Hugging Face is a PITA.

Google Bard could be an option, but I haven't found a way to jailbreak it, and Claude is not available in my country.

Any ideas?

29
30

Bing (multimodal) image input is free!

31

Apple designer and Humane cofounder Imran Chaudhri envisions a future where AI enables using a phone without a screen

32
33

Abstract:

Large language models (LLMs) have demonstrated impressive results in developing generalist planning agents for diverse tasks. However, grounding these plans in expansive, multi-floor, and multi-room environments presents a significant challenge for robotics. We introduce SayPlan, a scalable approach to LLM-based, large-scale task planning for robotics using 3D scene graph (3DSG) representations. To ensure the scalability of our approach, we: (1) exploit the hierarchical nature of 3DSGs to allow LLMs to conduct a semantic search for task-relevant subgraphs from a smaller, collapsed representation of the full graph; (2) reduce the planning horizon for the LLM by integrating a classical path planner and (3) introduce an iterative replanning pipeline that refines the initial plan using feedback from a scene graph simulator, correcting infeasible actions and avoiding planning failures. We evaluate our approach on two large-scale environments spanning up to 3 floors, 36 rooms and 140 objects, and show that our approach is capable of grounding large-scale, long-horizon task plans from abstract, and natural language instruction for a mobile manipulator robot to execute. We provide real robot video demonstrations and code on our project page sayplan.github.io.
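The core scalability trick in the abstract — letting the LLM semantically search a collapsed 3DSG and expand only task-relevant branches — can be sketched in a few lines. This is a toy stand-in, not the paper's method: in SayPlan the LLM itself decides which nodes to expand, whereas here a simple keyword match plays that role, and the scene contents are invented.

```python
# Toy hierarchical scene graph: floors -> rooms -> objects (invented contents).
scene = {
    "floor1": {"kitchen": ["mug", "kettle"], "office": ["stapler"]},
    "floor2": {"lab": ["microscope"], "storage": ["boxes"]},
}

def semantic_search(scene, task_objects):
    """Expand only the branches of the collapsed graph that contain
    task-relevant objects, returning a small task-relevant subgraph."""
    subgraph = {}
    for floor, rooms in scene.items():        # collapsed view: floors first
        for room, objects in rooms.items():   # expand a floor on demand
            hits = [o for o in objects if o in task_objects]
            if hits:
                subgraph.setdefault(floor, {})[room] = hits
    return subgraph

# "Bring me the mug" only needs the kitchen subgraph, not all 4 rooms.
relevant = semantic_search(scene, {"mug"})
print(relevant)
```

The payoff is that the planner (and its prompt) only ever sees the small subgraph, which is what keeps the approach tractable in multi-floor, multi-room environments.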

paper: https://arxiv.org/pdf/2307.06135.pdf

Video: https://cdn-uploads.huggingface.co/production/uploads/6258561f4d4291e8e63d8ae6/d_U_pzeCoJ2dTcBWz6n0r.mp4

34

Bard is available in new places and languages

  • What: Bard is now available in over 40 new languages including Arabic, Chinese (Simplified/Traditional), German, Hindi, Spanish, and more. We have also expanded access to more places, including all 27 countries in the European Union (EU) and Brazil.

  • Why: Bard is global and is intended to help you explore possibilities. Our English, Japanese, and Korean support helped us learn how to launch languages responsibly, enabling us to now support the majority of language coverage on the internet.

Google Lens in Bard

  • What: You can upload images alongside text in your conversations with Bard, allowing you to boost your imagination and creativity in completely new ways. To make this happen, we’re bringing the power of Google Lens into Bard, starting with English.

  • Why: Images are a fundamental part of how we put our imaginations to work, so we’ve added Google Lens to Bard. Whether you want more information about an image or need inspiration for a funny caption, you now have even more ways to explore and create with Bard.

Bard can read responses out loud

  • What: We’re adding text-to-speech capabilities to Bard in over 40 languages, including Hindi, Spanish, and US English.

  • Why: Sometimes hearing something aloud helps you bring an idea to life in new ways beyond reading it. Listen to responses and see what it helps you imagine and create!

Pinned & Recent Threads

  • What: You can now pick up where you left off with your past Bard conversations and organize them according to your needs. We’ve added the ability to pin conversations, rename them, and have multiple conversations going at once.

  • Why: The best ideas take time, sometimes multiple hours or days to create. Keep your threads and pin your most critical threads to keep your creative process flowing.

Share your Bard conversations with others

  • What: We’ve made it easier to share part or all of your Bard chat with others. Shareable links make seeing your chat and any sources just a click away so others can seamlessly view what you created with Bard.

  • Why: It’s hard to hold back a new idea sometimes. We wanted to make it easier for you to share your creations to inspire others, unlock your creativity, and show your collaboration process.

Modify Bard’s responses

  • What: We’re introducing 5 new options to help you modify Bard’s responses. Just tap to make the response simpler, longer, shorter, more professional, or more casual.

  • Why: When a response is close enough but needs a tweak, we’re making it easier to get you closer to your desired creation.

Export Python code to Replit

  • What: We’re continuing to expand Bard’s export capabilities for code. You can now export Python code to Replit, in addition to Google Colab.

  • Why: Streamline your workflow and continue your programming tasks by moving Bard interactions into Replit.

35

With LLaMA V2, Meta may be trying to benefit from the open-source community, similar to what Google has done with Android.

The Financial Times, citing three sources familiar with the project, reports that Meta wants to launch a commercial AI model to compete with OpenAI, Microsoft, and Google. The model is said to generate language, code, and images.

It may be a new variant of Meta's LLaMA, a large language model used in numerous open-source projects. LLaMA v1 has only been released under a research license and therefore may not be used directly for commercial purposes. However, replicas exist.

Meta CEO Mark Zuckerberg has already announced that a new AI model is in the works, which could be LLaMA v2 or go under a different name. Meta wants to use the model for its own services and offer it to external interested parties, according to Zuckerberg. Special attention is being paid to safety.

36

Comment ~~stolen~~ borrowed from reddit:

TL;DR: The Federal Trade Commission (FTC) is investigating OpenAI, the maker of ChatGPT, for potential violations of consumer protection laws. The investigation will focus on whether OpenAI has engaged in unfair or deceptive privacy or data security practices, or if it has engaged in unfair or deceptive practices relating to risks of harm to consumers. The FTC has asked OpenAI to provide information on how it obtains and uses consumer information to train its large language models, how it assesses risk, and how it deals with misleading or disparaging statements about people. The FTC is also seeking information about a bug disclosed by OpenAI in March 2023 that may have exposed some users' chat history and payment-related information. OpenAI has not yet responded to the investigation.

37

The Associated Press (AP) and OpenAI have announced a new partnership that seeks to examine potential use cases for generative AI in news products and services. As part of the arrangement, OpenAI will license part of AP's text archive and, in return, AP will tap into OpenAI's advanced technology and product expertise.

38

Like, without any human intervention. What would our routine be like?

39

Stability AI launches a web tool that allows you to turn your doodles into pretty AI-generated images: https://clipdrop.co/stable-doodle

It's not clear what the free daily limit is, but it seems to be about 3 prompts every hour. Results are pretty neat.

A crude drawing of a cat's head with the prompt "a picture of a cat head in space" gives good results.

40
41

TL;DR: (AI-generated 🤖)

The author, an early pioneer in the field of aligning artificial general intelligence (AGI), expresses concern about the potential dangers of creating a superintelligent AI. They highlight the lack of understanding and control over modern AI systems, emphasizing the need to shape the preferences and behavior of AGI to ensure it doesn't harm humanity. The author predicts that the development of AGI smarter than humans, with different goals and values, could lead to disastrous consequences. They stress the urgency and seriousness required in addressing this challenge, suggesting measures such as banning large AI training runs to mitigate the risks. Ultimately, the author concludes that humanity must confront this issue with great care and consideration to avoid catastrophic outcomes.

42
43

GPT-4's details are leaked.

It is over.

Everything is here: https://archive.is/2RQ8X

Parameters count:

GPT-4 is more than 10x the size of GPT-3. We believe it has a total of ~1.8 trillion parameters across 120 layers.

Mixture Of Experts - Confirmed.

OpenAI was able to keep costs reasonable by utilizing a mixture-of-experts (MoE) model. They utilize 16 experts within their model, each about ~111B parameters for the MLP. 2 of these experts are routed to per forward pass.

MoE Routing:

While the literature talks a lot about advanced routing algorithms for choosing which experts to route each token to, OpenAI’s is allegedly quite simple, for the current GPT-4 model.

There are roughly ~55B shared parameters for attention.

Inference:

Each forward pass inference (generation of 1 token) only utilizes ~280B parameters and ~560 TFLOPs. This contrasts with the ~1.8 trillion parameters and ~3,700 TFLOPs that would be required per forward pass of a purely dense model.
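The routing scheme described above (16 experts, top 2 active per token) is the standard top-k MoE pattern, and can be sketched in a few lines of numpy. This is purely illustrative: the dimensions are tiny, each "expert" is a single linear layer standing in for a ~111B-parameter MLP, and the simple argmax-style gate is an assumption consistent with the post's claim that OpenAI's routing is "quite simple".

```python
import numpy as np

def top2_moe_forward(x, gate_w, experts, k=2):
    """Route each token to its top-k experts and mix their outputs
    by softmaxed gate scores. Only k experts run per token."""
    logits = x @ gate_w                            # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]     # indices of the k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, topk[t]]
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                   # softmax over the chosen k
        for wgt, e in zip(weights, topk[t]):
            out[t] += wgt * experts[e](x[t])       # the other 14 experts stay idle
    return out, topk

rng = np.random.default_rng(0)
d, n_experts, n_tokens = 8, 16, 4
x = rng.normal(size=(n_tokens, d))
gate_w = rng.normal(size=(d, n_experts))
expert_ws = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda v, W=W: v @ W for W in expert_ws]
out, routed = top2_moe_forward(x, gate_w, experts)
print(out.shape, routed.shape)
```

This also makes the FLOP arithmetic above concrete: with 2 of 16 experts active, only the routed experts' parameters (plus the shared attention parameters) are touched per token, which is how ~1.8T total parameters collapse to ~280B active per forward pass.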

Dataset:

GPT-4 is trained on ~13T tokens.

These are not unique tokens; repeated epochs are counted as additional tokens.

Epoch number: 2 epochs for text-based data and 4 for code-based data.

There are millions of rows of instruction fine-tuning data from ScaleAI and internal sources.

GPT-4 32K

There was an 8k context length (seqlen) for the pre-training phase. The 32k seqlen version of GPT-4 is based on fine-tuning of the 8k after the pre-training.

Batch Size:

The batch size was gradually ramped up over a number of days on the cluster, but by the end, OpenAI was using a batch size of 60 million tokens! This, of course, is "only" a batch size of 7.5 million tokens per expert, since not every expert sees all tokens.

For the real batch size:

Divide the 60M figure by the seq len to get the real batch size. Just stop with these misleading numbers already.

Parallelism Strategies

To parallelize across all of their A100 GPUs, they utilized 8-way tensor parallelism, as that is the limit for NVLink.

Beyond that, they are using 15-way pipeline parallelism.

(They likely used ZeRO Stage 1. It is possible they used block-level FSDP.)

Training Cost

OpenAI’s training FLOPS for GPT-4 is ~2.15e25, on ~25,000 A100s for 90 to 100 days at about 32% to 36% MFU.

Part of this extremely low utilization is due to an absurd number of failures requiring restarts from checkpoints.

If their cost in the cloud was about $1 per A100 hour, the training costs for this run alone would be about $63 million.

(Today, the pre-training could be done with ~8,192 H100 in ~55 days for $21.5 million at $2 per H100 hour.)
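The quoted dollar figures are easy to sanity-check as GPU-hours × price. A straight multiply gives $54–60M for the 90–100 day A100 run (a bit under the post's $63M, which presumably folds in some overhead), and it reproduces the ~$21.5M H100 figure almost exactly:

```python
# Back-of-the-envelope check of the quoted training-cost figures.
def cluster_cost(gpus, days, usd_per_gpu_hour):
    """Total rental cost of running `gpus` GPUs for `days` days."""
    return gpus * days * 24 * usd_per_gpu_hour

a100_low  = cluster_cost(25_000, 90,  1.0)   # 90-day A100 run at $1/GPU-hour
a100_high = cluster_cost(25_000, 100, 1.0)   # 100-day A100 run
h100      = cluster_cost(8_192,  55,  2.0)   # hypothetical H100 rerun at $2/GPU-hour
print(f"A100 run: ${a100_low/1e6:.0f}M-${a100_high/1e6:.0f}M, "
      f"H100 run: ${h100/1e6:.1f}M")
```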

Mixture of Expert Tradeoffs

There are multiple MoE tradeoffs taken: For example, MoE is incredibly difficult to deal with on inference because not every part of the model is utilized on every token generation.

This means parts may sit dormant when other parts are being used. When serving users, this really hurts utilization rates.

Researchers have shown that using 64 to 128 experts achieves better loss than 16 experts, but that’s purely research.

There are multiple reasons to go with fewer experts. One reason for OpenAI choosing 16 experts is that more experts make it difficult to generalize across many tasks. More experts can also make convergence harder to achieve.

With such a large training run, OpenAI instead chose to be more conservative on the number of experts.

GPT-4 Inference Cost

GPT-4 costs 3x that of the 175B-parameter Davinci.

This is largely due to the larger clusters required for GPT-4 and much lower utilization achieved.

An estimate of its cost is $0.0049 per 1k tokens for 128 A100s to inference GPT-4 at 8k seqlen, and $0.0021 per 1k tokens for 128 H100s to inference GPT-4 at 8k seqlen. It should be noted that we assume decently high utilization and high batch sizes.

Multi-Query Attention

OpenAI is using MQA just like everybody else.

Because of that, only 1 KV head is needed and memory capacity for the KV cache can be significantly reduced. Even then, the 32k-seqlen GPT-4 definitely cannot run on 40GB A100s, and the 8k version is capped on max batch size.
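The KV-cache claim is easy to make concrete. The head count and head dimension below are assumptions (128 heads of dim 128 is a guess for illustration, not a leaked number), but the structure of the arithmetic is standard: multi-head attention caches one K and V per head per layer per position, while MQA shares a single K/V head.

```python
def kv_cache_gib(n_layers, seq_len, n_kv_heads, head_dim, batch=1, bytes_per=2):
    """Size of the K/V cache in GiB: 2 tensors (K and V), fp16 by default."""
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * batch * bytes_per / 2**30

# Hypothetical GPT-4-like shapes: 120 layers (per the post), 32k context,
# and an *assumed* 128 attention heads of dimension 128.
mha = kv_cache_gib(120, 32_768, 128, 128)   # full multi-head: one K/V per head
mqa = kv_cache_gib(120, 32_768, 1,   128)   # MQA: one shared K/V head
print(f"MHA: {mha:.0f} GiB, MQA: {mqa:.3f} GiB per 32k sequence")
```

Under these assumed shapes, a single full-MHA 32k sequence would need ~240 GiB of cache versus under 2 GiB with MQA, which is consistent with the post's point that even with MQA the 32k model is memory-constrained on 40GB A100s once weights and batching are accounted for.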

Continuous batching

OpenAI implements both variable batch sizes and continuous batching. This is to allow some level of maximum latency as well as to optimize inference costs.

Vision Multi-Modal

It is a separate vision encoder from the text encoder, with cross-attention. The architecture is similar to Flamingo. This adds more parameters on top of the 1.8T of GPT-4. It is fine-tuned with another ~2 trillion tokens, after the text only pre-training.

OpenAI wanted to train the vision model from scratch, but it wasn't mature enough, so they decided to derisk it by starting with text.

One of the primary purposes of this vision capability is for autonomous agents able to read web pages and transcribe what’s in images and video.

Some of the data they train on is joint data (rendered LaTeX/text), screenshots of web pages, and YouTube videos: sampling frames and running Whisper on them to get transcripts.

[Don't want to say "I told you so" but..]

Speculative Decoding

OpenAI might be using speculative decoding for GPT-4's inference (not 100% sure).

The idea is to use a smaller, faster model to decode several tokens in advance, and then feed them into the large oracle model as a single batch.

If the small model was right about its predictions, the larger model agrees and we can decode several tokens in a single batch.

But if the larger model rejects the tokens predicted by the draft model, then the rest of the batch is discarded and we continue with the larger model.

The conspiracy theory that the new GPT-4's quality has deteriorated might simply be because they are letting the oracle model accept lower-probability sequences from the speculative decoding model.
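The accept/reject loop described above can be sketched with toy models. This is the greedy variant (accept a draft token only if the oracle's own greedy choice matches); real speculative sampling accepts/rejects probabilistically so the output distribution matches the oracle, but the batching logic is the same. Both "models" here are made-up functions over a repeating token pattern.

```python
def speculative_step(draft_next, oracle_next, prefix, k=4):
    """One round of greedy speculative decoding.
    draft_next / oracle_next: fn(seq) -> next token (toy greedy models)."""
    # 1) the cheap draft model proposes k tokens autoregressively
    proposal = list(prefix)
    for _ in range(k):
        proposal.append(draft_next(proposal))
    draft_tokens = proposal[len(prefix):]
    # 2) the oracle verifies the whole proposal "in one batch":
    #    accept the longest agreeing prefix, then emit one oracle token
    accepted, seq = [], list(prefix)
    for t in draft_tokens:
        if oracle_next(seq) == t:
            accepted.append(t)
            seq.append(t)
        else:
            break                           # rest of the batch is discarded
    accepted.append(oracle_next(seq))       # oracle's own token ends the round
    return accepted

# Toy models: the "true" continuation is 0,1,2,3,0,...; the draft model
# is right except every 4th token, where it guesses 9.
oracle = lambda s: len(s) % 4
draft  = lambda s: len(s) % 4 if len(s) % 4 != 3 else 9
out = speculative_step(draft, oracle, prefix=[], k=4)
print(out)   # draft tokens 0,1,2 accepted; 9 rejected; oracle emits 3
```

Note how 4 tokens came out of a single oracle "call batch" even though the draft was wrong once; that amortization is the entire speedup, and loosening the acceptance criterion (as the "quality deteriorated" theory suggests) trades output fidelity for more accepted draft tokens.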

Inference Architecture

The inference runs on a cluster of 128 GPUs.

There are multiple of these clusters in multiple datacenters in different locations.

It is done in 8-way tensor parallelism and 16-way pipeline parallelism.

Each node of 8 GPUs holds only ~130B parameters.

The model has 120 layers, so it fits in 15 different nodes. [Possibly there are fewer layers on the first node, since it also needs to compute the embeddings.]
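The parallelism numbers above hang together arithmetically, which is worth checking since the thread itself is inconsistent in places (16-way pipeline vs. 15 nodes). A sketch of the arithmetic, taking the post's figures at face value:

```python
tensor_parallel = 8        # per the post: NVLink limits tensor parallelism to 8
pipeline_parallel = 16
gpus_per_replica = tensor_parallel * pipeline_parallel
print(gpus_per_replica)    # matches the quoted 128-GPU inference cluster

total_params = 1.8e12      # the claimed total parameter count
params_per_stage = total_params / pipeline_parallel
print(f"~{params_per_stage / 1e9:.1f}B params per 8-GPU pipeline stage")
```

A naive even split gives ~112.5B parameters per stage, in the right ballpark of the quoted ~130B per node once embeddings and the vision encoder's extra parameters are added on top of the 1.8T.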

According to these numbers, OpenAI should have trained on 2x the tokens if they were trying to hit Chinchilla-optimal.

[let alone surpass it like we do]

This goes to show that they are struggling to get high-quality data.

Why no FSDP?

A possible reason for this could be that some of the hardware infra they secured is of an older generation.

This is pretty common at local compute clusters, as the organisation usually upgrades the infra in several "waves" to avoid a complete pause of operations.

Dataset Mixture

They trained on 13T tokens.

CommonCrawl & RefinedWeb are both 5T.

Remove the duplicated tokens from the multiple epochs and we get to a much more reasonable number of "unaccounted for" tokens: the "secret" data.

By this point we already have rumors that parts of it came from Twitter, Reddit & YouTube.

[Rumors that start to become lawsuits]

Some speculations are:

  • LibGen (4M+ books)
  • Sci-Hub (80M+ papers)
  • All of GitHub

My own opinion:

The missing dataset is a custom dataset of college textbooks, collected by hand for as many courses as possible.

This is very easy to convert to a txt file and then, with self-instruct, into instruction form.

This creates the "illusion" that GPT-4 "is smart" no matter who uses it.

Computer scientist? Sure! It can help you with your questions about P != NP.

Philosophy major? It can totally talk to you about epistemology.

Don't you see?

It was trained on the textbooks. It is so obvious.

There are also papers that try to forcibly extract memorized parts of books from GPT-4 to understand what it was trained on.

There are some books it knows so well that it has seen them for sure.

Moreover, if I remember correctly, it even knows the unique IDs of Project Euler exercises.

44

Summary:

Focused Transformer: A new technique for long-context language modeling. The paper introduces Focused Transformer (FOT), a method that uses contrastive learning and external memory to improve the structure of the (key, value) space and extend the context length of transformer models. FOT can fine-tune existing large models without changing their architecture and achieve better performance on tasks that require long context.

LONGLLAMA: Extending LLaMA’s context length with FOT. The paper demonstrates the application of FOT to fine-tune OpenLLaMA models, which are large language models with memory augmentation. The resulting models, called LONGLLAMAs, can handle a context length of up to 256k tokens and show improvements on few-shot learning tasks such as TREC and WebQS.

Distraction issue: A key challenge for scaling context length. The paper identifies the distraction issue as a major obstacle for using large memory databases in multi-document scenarios. The distraction issue occurs when keys from irrelevant documents overlap with keys from relevant ones, making them hard to distinguish. FOT alleviates this issue by exposing the memory attention layer to both positive and negative examples during training.
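The distraction issue is easy to demonstrate with plain softmax attention: once a few relevant keys are surrounded by many overlapping keys from irrelevant documents, the attention mass on the relevant ones collapses. The sketch below uses made-up random vectors (not the paper's setup) to show the failure mode that FOT's contrastive training with negative examples is designed to fix.

```python
import numpy as np

def attention_weights(q, keys):
    """Softmax attention of a single query over a set of keys."""
    scores = keys @ q
    w = np.exp(scores - scores.max())
    return w / w.sum()

rng = np.random.default_rng(1)
d = 16
q = rng.normal(size=d)
q /= np.linalg.norm(q)
rel_keys = q + 0.1 * rng.normal(size=(4, d))      # 4 keys from the relevant document
distractors = q + 0.3 * rng.normal(size=(64, d))  # 64 overlapping keys from other docs
w = attention_weights(q, np.vstack([rel_keys, distractors]))
print(f"attention mass on the 4 relevant keys: {w[:4].sum():.2f}")
```

Because the distractor keys score nearly as high as the relevant ones, most of the attention mass leaks to irrelevant documents; FOT's training objective pushes relevant and irrelevant keys apart in the (key, value) space so this ratio stays high as the memory grows.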

ELI5

Sure! Imagine you have a toy box with lots of toys inside. You want to find your favorite toy, but there are so many toys that it's hard to find it. The Focused Transformer is like a special helper that can look inside the toy box and find your favorite toy quickly, even if there are lots of other toys in the way. It does this by remembering which toys are important and which ones are not, so it can find the right toy faster. Does that make sense?

Implications

The Focused Transformer (FOT) technique has the potential to improve the performance of language models by extending their context length. This means that the models can better understand and incorporate new information, even when it is spread across a large number of documents. The resulting LONGLLAMA models show significant improvements on tasks that require long-context modeling, such as retrieving information from large databases. This research could have implications for natural language processing, code generation, quantitative reasoning, and theorem proving, among other areas. It could also make it easier to fine-tune existing large-scale models to lengthen their effective context.

45

Superintelligence when? This question is more urgent than ever as we hear competing timelines from Inflection AI, OpenAI, and the leaders at the Center for AI Safety. This video not only covers what was said, it also offers some data on what superintelligence is projected to be capable of.

I discuss things that could hasten those timelines (see the new Netflix doc, Top 1% for Creativity, and the AI Natural Selection paper) or slow them down (ft. Yuval Noah Harari, the new Jailbroken paper, and more). And I end with some reflections on what it might mean to interact with a superintelligence (ft. Douglas Hofstadter).

46

[YANDHI - WAR WITH THE MATRIX (KANYE AI X BIG BABY GANDHI)](https://youtube.com/watch?v=CGyPqImBOjY)

47
48

The alignment-minetest project just put out a new blog post detailing what we've been working on for the past several months.

The post is titled "Minetester: A fully open RL environment built on Minetest" and covers:

  • The Minetester framework and how it relates to existing efforts in AI for Minecraft

  • A PPO baseline and environment customization using the framework.

  • Basic interpretability work we did on the learned PPO policy.

  • General and specific takeaways from our work so far.

  • Next steps for the project.

Relevant Links:

Blog Post: https://blog.eleuther.ai/minetester-intro/

Minetest Channel: https://discord.com/channels/729741769192767510/1014999314835181650

Minetester Repo: https://github.com/EleutherAI/minetest/

Minetester Baselines: https://github.com/EleutherAI/minetest-baselines/

Interpretabilty Notebook: https://github.com/EleutherAI/minetest-interpretabilty-notebook

Special thanks to @rkla @nev and @Eduuu for their major contributions to the project. Additional thanks to @josiah_wi and @Delta for their contributions.


Martineski: I got the announcement on this Discord server: https://discord.gg/AY2Hg2qj

Direct link to the announcement: https://discord.com/channels/729741769192767510/794042109048651818/1127288927888343084

49
50

Original title: The US Military Is Taking Generative AI Out for a Spin

Summary: The US military is testing five LLMs as part of an eight-week exercise run by the Pentagon’s digital and AI office. "It was highly successful. It was very fast," a US Air Force colonel is quoted as saying. "We did it with secret-level data," he adds, saying that it could be deployed by the military in the very near term.
