Martineski

joined 1 year ago
MODERATOR OF
[–] Martineski@lemmy.fmhy.ml -3 points 1 year ago (2 children)

Thanks, I'm blocking this community.

[–] Martineski@lemmy.fmhy.ml 8 points 1 year ago

The pig one tho...

[–] Martineski@lemmy.fmhy.ml 1 points 1 year ago

Just stumbled upon this post while looking for news and I think that it's a great example of why implementing this rule was a right call: https://www.reddit.com/r/singularity/comments/14u6x5p/toyota_claims_battery_breakthrough_with_a_range/

 

In human conversations, individuals can indicate relevant regions within a scene while addressing others. In turn, the other person can then respond by referring to specific regions if necessary. This natural referential ability in dialogue remains absent in current Multimodal Large Language Models (MLLMs). To fill this gap, this paper proposes an MLLM called Shikra, which can handle spatial coordinate inputs and outputs in natural language. Its architecture consists of a vision encoder, an alignment layer, and a LLM. It is designed to be straightforward and simple, without the need for extra vocabularies, position encoder, pre-/post-detection modules, or external plug-in models. All inputs and outputs are in natural language form. Referential dialogue is a superset of various vision-language (VL) tasks. Shikra can naturally handle location-related tasks like REC and PointQA, as well as conventional VL tasks such as Image Captioning and VQA. Experimental results showcase Shikra's promising performance. Furthermore, it enables numerous exciting applications, like providing mentioned objects' coordinates in chains of thoughts and comparing user-pointed regions similarities. Our code, model and dataset are accessed at this https URL.

 

Amazon CEO Andy Jassy called generative A.I. “one of the biggest technical transformations of our lifetimes” in an interview with CNBC on Thursday. He also called many of today’s A.I. chatbots and other generative A.I. tools part of the “hype cycle,” declaring that Amazon was focused on the “substance cycle.”

Amazon’s bona fides in the space are well established, having been a player in artificial intelligence and machine learning long before the ChatGPTs and Bards of the world were publicly released. Former Fortune editor Brian Dumaine wrote a book in 2020 about how Amazon founder Jeff Bezos realized early on that imbuing machine learning into every facet of the company would allow it to gather data to constantly improve itself.

Much as it did with Amazon Web Services, which practically birthed the cloud computing industry that now powers the internet’s biggest companies, including its competitors, Amazon’s A.I. strategy is focused on cementing its position as a major player across the entirety of the A.I. supply chain.

“Every single business unit inside of Amazon is working intensely and very broadly on generative A.I.,” Jassy says.

Jassy shed some light on Amazon’s A.I. game plan, outlining three macro layers: the computing capabilities, the underlying models, and what Jassy refers to as the “application layer,” for example, ChatGPT or Bard.

 

For 3D object manipulation, methods that build an explicit 3D representation perform better than those relying only on camera images. But using explicit 3D representations like voxels comes at large computing cost, adversely affecting scalability. In this work, we propose RVT, a multi-view transformer for 3D manipulation that is both scalable and accurate. Some key features of RVT are an attention mechanism to aggregate information across views and re-rendering of the camera input from virtual views around the robot workspace. In simulations, we find that a single RVT model works well across 18 RLBench tasks with 249 task variations, achieving 26% higher relative success than the existing state-of-the-art method (PerAct). It also trains 36X faster than PerAct for achieving the same performance and achieves 2.3X the inference speed of PerAct. Further, RVT can perform a variety of manipulation tasks in the real world with just a few (∼10) demonstrations per task. Visual results, code, and trained model are provided at this https URL.

 

Covid-19 is said to cause long-term side effects in up to 67% of patients, and these health consequences can include chronic fatigue, loss of taste and smell and brain fog. Increasingly common too is Covid-related hair loss. Known as telogen effluvium, this phenomenon manifests as clumps of hair falling out after brushing or washing your hair.

It’s normal to shed hair daily – we lose about 100-150 hairs each day as hair drops from follicles to make way for new hair growth. This growth cycle occurs because 90% of the hair on our heads is in a growth phase (called anagen), while the remaining 10% is in a resting phase (called telogen). Anagen lasts for about three years before transitioning into the shorter telogen phase, following which hair is shed.

A stressful event like childbirth, certain medications, intense psychological stress and Covid-19 can trigger our bodies to shift a greater-than-normal proportion of growing anagen hairs into a resting telogen state, according to the University of Utah.

“Covid-related hair loss can affect up to 33% of symptomatic patients and 10% of asymptomatic patients,” says a plastic surgeon who deals with hair loss patients. “And this kind of hair loss seems to be different from that induced by stress or disease as cytokines (substances secreted by the body’s immune system) appear to cause direct damage to hair follicles,” she adds.

Covid-induced hair loss has also been reported to start earlier after the stressful event – in two months instead of the usual three.

 

Recent work suggests that interpolating between the weights of two specialized language models can transfer knowledge between tasks in a way that multi-task learning cannot. However, very few have explored interpolation between more than two models, where each has a distinct knowledge base. In this paper, we introduce Derivative Free Weight-space Ensembling (DFWE), a new few-sample task transfer approach for open-domain dialogue. Our framework creates a set of diverse expert language models trained using a predefined set of source tasks. Next, we finetune each of the expert models on the target task, approaching the target task from several distinct knowledge bases. Finally, we linearly interpolate between the model weights using a gradient-free-optimization algorithm, to efficiently find a good interpolation weighting. We demonstrate the effectiveness of the method on FETA-Friends outperforming the standard pretrain-finetune approach.

 

The idea is simple - Specify what you want to research, and the AI will autonomously research it for you in minutes!

▸ One prompt generates an unbiased, factual and in depth research report

▸ Generate research, outlines, resource and lessons reports

▸ Aggregates over 20 web sources per research

▸ Includes an easy to use web interface

▸ Open source: https://github.com/assafelovic/gpt-researcher

▸ Scrapes web sources with javascript support

▸ Keeps track and context of visited and used web sources

 

Abstract:

Since the first laser was invented, the pursuit of high-energy lasers (HELs) has always been enthusiastic. The first revolution of HELs was pushed by the fusion of laser and aerospace in the 1960s, with the chemical rocket engines giving fresh impetus to the birth of gas flow and chemical lasers, which finally turned megawatt lasers from dream into reality. Nowadays, the development of HELs has entered the age of electricity as well as the rocket engines. The properties of current electric rocket engines are highly consistent with HELs’ goals, including electrical driving, effective heat dissipation, little medium consumption and extremely light weight and size, which inspired a second fusion of laser and aerospace and motivated the exploration for potential HELs. As an exploratory attempt, a new configuration of diode pumped metastable rare gas laser was demonstrated, with the gain generator resembling an electric rocket-engine for improved power scaling ability.

 

Original title: Focused Transformer: Contrastive Training for Context Scaling

Large language models have an exceptional capability to incorporate new information in a contextual manner. However, the full potential of such an approach is often restrained due to a limitation in the effective context length. One solution to this issue is to endow an attention layer with access to an external memory, which comprises of (key, value) pairs. Yet, as the number of documents increases, the proportion of relevant keys to irrelevant ones decreases, leading the model to focus more on the irrelevant keys. We identify a significant challenge, dubbed the distraction issue, where keys linked to different semantic values might overlap, making them hard to distinguish. To tackle this problem, we introduce the Focused Transformer (FoT), a technique that employs a training process inspired by contrastive learning. This novel approach enhances the structure of the (key, value) space, enabling an extension of the context length. Our method allows for fine-tuning pre-existing, large-scale models to lengthen their effective context. This is demonstrated by our fine-tuning of 3B and 7B OpenLLaMA checkpoints. The resulting models, which we name LongLLaMA, exhibit advancements in tasks requiring a long context. We further illustrate that our LongLLaMA models adeptly manage a 256k context length for passkey retrieval.

 
7
hmmm (lemmy.fmhy.ml)
submitted 1 year ago* (last edited 1 year ago) by Martineski@lemmy.fmhy.ml to c/hmmm@lemmy.fmhy.ml
 
[–] Martineski@lemmy.fmhy.ml 1 points 1 year ago

There's so many test posts lately.

[–] Martineski@lemmy.fmhy.ml 2 points 1 year ago

It's not a commonly seen rule. I do it for transparency reasons. I will make a post about the rules and will pin it but first I will need to rewrite some rules and I need to find a time for that when I'm really busy. :x

[–] Martineski@lemmy.fmhy.ml 1 points 1 year ago (3 children)

Add the date of the article to the title as per our rule 6. Copy this:

(article from 19.07.2023)

Thank you.

[–] Martineski@lemmy.fmhy.ml 9 points 1 year ago (5 children)

You mean my post

[–] Martineski@lemmy.fmhy.ml 2 points 1 year ago

You're right, seems like it:

Currently, Claude 2 API is available to businesses only. Additionally, to gain access, you need to send a request to the Anthropic team.

However, if you live anywhere except the US or UK, you cannot use Claude 2.

Source: https://thenaturehero.com/claude-2-api/ ("Claude 2 API – Everything You Need To Know")

[–] Martineski@lemmy.fmhy.ml 2 points 1 year ago (2 children)

We are pleased to announce Claude 2, our new model. Claude 2 has improved performance, longer responses, and can be accessed via API as well as a new public-facing beta website, claude.ai.

[–] Martineski@lemmy.fmhy.ml 2 points 1 year ago* (last edited 1 year ago) (1 children)

Congrats!

(Also, I was first to hit 3k)

view more: next ›