this post was submitted on 10 Nov 2024

Hackaday

[–] Goun@lemmy.ml 1 points 6 days ago (1 children)

Can someone explain what this actually is? It's a python script that generates.. screenshots? I don't get it

[–] Sphks@lemmy.dbzer0.com 6 points 5 days ago* (last edited 5 days ago)

It is based on image generators like DALL-E (more precisely, video generators like Sora). An AI image generator takes an input, such as random noise, and tries to fill in the gaps, steered in one direction (usually by a text prompt like "a cat playing saxophone"). The AI has been taught what cats look like, what saxophones look like, and what playing a saxophone looks like.

Here, the AI has been taught what the Minecraft first-person view looks like, using hours and hours of video of someone playing, maybe bots.

Now, if you press the forward arrow, zoom the picture by spreading the pixels out from the center of the screen. That leaves blanks between the pixels. Have the AI fill in the blanks with what it thinks Minecraft should look like. Repeat for each frame and you can move forward. Do similar things for the other commands (turn left, jump...). This way you can explore the world indefinitely, and the AI invents the world in real time.
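
To make the "spread the pixels, then fill the blanks" idea concrete, here is a toy sketch in plain Python. It is only an illustration of the simplified picture above, not how the actual project works: `spread_from_center`, `fill_gaps`, and `step_forward` are made-up names, the frame is a small grid of numbers, and the "AI" is replaced by a neighbor average, which is a stand-in for a trained model.

```python
def spread_from_center(frame, scale=2):
    """Push every pixel away from the center; untouched cells stay None (blank)."""
    h, w = len(frame), len(frame[0])
    cy, cx = h // 2, w // 2
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny = cy + (y - cy) * scale
            nx = cx + (x - cx) * scale
            # Pixels pushed past the border fall off the screen.
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = frame[y][x]
    return out


def fill_gaps(frame):
    """Stand-in for the model: fill each blank with the mean of its known neighbors."""
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for y in range(h):
        for x in range(w):
            if frame[y][x] is None:
                known = [
                    frame[j][i]
                    for j in range(max(y - 1, 0), min(y + 2, h))
                    for i in range(max(x - 1, 0), min(x + 2, w))
                    if frame[j][i] is not None
                ]
                out[y][x] = sum(known) / len(known) if known else 0.0
    return out


def step_forward(frame):
    """One 'forward arrow' press: zoom, then let the stand-in model fill the blanks."""
    return fill_gaps(spread_from_center(frame))


# A 9x9 grayscale "screenshot"; calling step_forward repeatedly keeps moving forward.
frame = [[float(y * 9 + x) for x in range(9)] for y in range(9)]
next_frame = step_forward(frame)
```

A real model would fill the blanks with plausible Minecraft detail instead of a blur, but the loop has the same shape: transform the current frame according to the key pressed, then generate what is missing.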

I have not looked at the details, but I think the issue is that there is no memory of the world other than what you see on the screen. If you look left you see something; look right, then look left again, and you see a different world. Edit: Yeah, that's an issue shown in the article.
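
The memory problem can be shown with another toy sketch, under the same caveats: `turn` is a hypothetical helper, the frame is a tiny grid, and random numbers stand in for whatever the model would generate. The only state is the current frame, so anything that scrolls off the screen is gone and gets reinvented.

```python
import random


def turn(frame, direction, rng):
    """Shift the view by one column; the newly exposed column is invented fresh."""
    new_col = [rng.random() for _ in frame]  # stand-in for model output
    if direction == "left":
        # Content slides right, the rightmost column is lost for good.
        return [[new_col[r]] + row[:-1] for r, row in enumerate(frame)]
    # Turning right: content slides left, the leftmost column is lost.
    return [row[1:] + [new_col[r]] for r, row in enumerate(frame)]


rng = random.Random(0)
start = [[float(10 * r + c + 1) for c in range(4)] for r in range(3)]
back = turn(turn(start, "left", rng), "right", rng)

# Columns that stayed on screen survive; the one that scrolled off is different.
same_part = all(b[:-1] == s[:-1] for b, s in zip(back, start))
lost_part_changed = all(b[-1] != s[-1] for b, s in zip(back, start))
```

After turning left and back right, `same_part` and `lost_part_changed` are both true: the view is back where it started, but the part of the world that left the screen has been replaced.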