MELLOW MOTIVE
Apple's AI Paradigm Shift and Adobe's New AI Capabilities.

Tencent shows off GameGen-O, their open-world game engine.

Dan MacDougall
October 18, 2024

AI Riddle:

I build up castles,
I tear down mountains.
I make some men blind,
I help others to see.

What am I?

Today’s Motive:

🍎 Apple flips the table on AI. What is really going on?
📽️ Adobe unveils their new AI video and image capabilities.
🕹️ Tencent’s GameGen-O marks the beginning of something big.

🤑 Check out the latest AI deals below. 💸

🛠️ Get the scoop on the latest AI tools in the At a Glance section.

AI NEWS

😟 Apple's AI Announcement: Surprising New Direction You Didn't See Coming.

image created in Midjourney

🧩 Recent research conducted by Apple suggests that Large Language Models (LLMs) may not exhibit true logical reasoning as previously thought.
Link here for research paper.

Summary:

Research indicates that LLMs rely more on “sophisticated pattern matching” than “true logical reasoning.” This even goes for OpenAI’s o1 advanced reasoning model.

Over 20 LLMs were tested, including OpenAI’s o1 and GPT-4o, Google’s Gemma 2, Mistral-7b, Phi-3-medium, and Meta’s Llama 3.

A common benchmark for assessing reasoning skills in LLMs is the GSM8K test. There’s reason to believe models might have been trained on the test data, leading to inflated performance metrics.
The findings revealed a notable decrease in performance across all models when benchmark variables were modified.

OpenAI’s models consistently outperformed other models but still showed reduced accuracy under altered conditions. When superfluous phrases were added to mathematical problems, there was a significant drop in performance.
OpenAI’s advanced model experienced a 17.5% decrease in accuracy, while some models saw reductions as high as 65%. The Phi-3 and Gemma 2 models showed the biggest performance drops.

The researchers have introduced GSM-Symbolic, an enhanced benchmark that varies each problem by changing names and numbers, adding irrelevant information, and adjusting complexity.
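To make the idea concrete, here is a minimal sketch of how a templated benchmark like this could work. It is not Apple's code: the template, the name list, the distractor sentence, and the `naive_model` stand-in are all made up for illustration. The toy "model" simply adds up every number it sees, which is a crude form of pattern matching.

```python
import random
import re

# Hypothetical template, names, and distractor: illustrative only, not from the paper.
TEMPLATE = (
    "{name} picks {a} apples on Monday and {b} apples on Tuesday. "
    "{distractor}How many apples does {name} have in total?"
)
NAMES = ["Sophie", "Liam", "Ava", "Noah"]
DISTRACTOR = "5 of the apples are slightly smaller than average. "


def make_variant(rng, with_distractor=False):
    """Return (question, answer) with a freshly sampled name and numbers."""
    name = rng.choice(NAMES)
    a = rng.randint(2, 50)
    b = rng.randint(2, 50)
    question = TEMPLATE.format(
        name=name,
        a=a,
        b=b,
        distractor=DISTRACTOR if with_distractor else "",
    )
    return question, a + b


def naive_model(question):
    """Toy stand-in for a pattern matcher: sums every integer in the text."""
    return sum(int(n) for n in re.findall(r"\d+", question))


def accuracy(model, variants):
    """Fraction of variants the model answers exactly right."""
    correct = sum(1 for question, answer in variants if model(question) == answer)
    return correct / len(variants)


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [make_variant(rng) for _ in range(20)]
    noisy = [make_variant(rng, with_distractor=True) for _ in range(20)]
    # The toy model is perfect on clean variants and fails once the distractor appears.
    print(accuracy(naive_model, clean), accuracy(naive_model, noisy))
```

The irrelevant sentence changes nothing about the correct answer, yet the toy model folds the extra "5" into its sum every time. Real LLMs are far more capable than this, but the sketch shows the kind of failure the benchmark is designed to surface.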

The study concluded that “models tend to convert statements to operations without truly understanding their meaning,” supporting the hypothesis that LLMs primarily engage in pattern recognition rather than genuine logical reasoning.

The Motive:

If the results hold up and are supported by the community, it means we need to be more aware of what we are creating. The tests suggest that AI systems do not truly understand the content they generate; instead, they find patterns in data.

It bears mentioning that the authors of this study work for Apple, which is obviously a major competitor with Google, Meta, and even OpenAI — although Apple and OpenAI have a partnership, Apple is also working on its own AI models.

The reliance on pattern matching over true understanding may limit the applicability of LLMs in tasks requiring deep comprehension and reasoning. We may be further away from AGI than we thought.

Today’s top AI tool at a fraction of the price.

image provided from oncely.com

See oncely.com for more deals on AI tools.

AI TOOLS

🔥 Adobe Unveils Their New Generative AI Video and Image Models.

image from Adobe

Adobe has launched its Firefly Video Model, introducing a range of innovative tools for video creators, including features integrated directly into Premiere Pro.
Adobe’s Text-to-Video and Image-to-Video tools, first announced in September, are now rolling out as a limited public beta in the Firefly web app.
Check out Adobe Firefly for more information.

Summary:

Generative Extend: Allows users to extend the start or end of clips by 2 seconds or make mid-shot adjustments, such as correcting eye-lines or unexpected movements.
It supports resolutions of 720p or 1080p at 24 FPS and can also smooth audio transitions by extending sound effects and ambient noise.

image provided from Adobe

Image-to-Video: Add a reference image alongside text prompts to provide more control over the results. Great for B-roll content from images and short clips.
Text-to-Video: Creates 5-second clips that mimic various film styles, such as traditional film, 3D animation, and stop motion, and offers camera controls for refining videos.
Content Credentials: You can now ensure proper disclosure of AI-generated content and ownership rights when published online.

The Motive:

Currently, the maximum length for these AI-generated clips is five seconds, with output limited to 720p at 24 FPS. In comparison, competitors like OpenAI’s Sora can produce videos up to one minute long while maintaining high visual quality.

Adobe has been struggling lately with the onset of AI in video and image editing. Many small start-ups can deliver great video and image editing quality without the expensive monthly fees plaguing Adobe users.

It still is not clear when these new features will be out of beta, but at least they are up and running, which is more than can be said for OpenAI’s Sora, Google’s Veo, and Meta’s Movie Gen.

AI Tools at a glance:

Timetackle.com: Integrate AI into Google and Outlook calendars for automatic event and time tracking.
Palette.fm: Use AI to colorize old images and photographs from a variety of filters.
Magic-sketchpad.glitch.me: Use this app to let AI finish and touch-up drawings.

AI BREAKTHROUGH

🎮️ Tencent’s GameGen-O Develops Immersive AI Open-World Video Games.

image from GameGen-O

🕹️ AI is set to revolutionize open-world video game development sooner than expected. Introducing GameGen-O from Tencent, the first diffusion transformer model specifically engineered to create open-world video games.

Summary:

The model enables high-quality, open-domain generation by emulating a vast array of game engine features, including innovative characters, dynamic environments, complex actions, and diverse events.
Researchers built the first open-world game dataset (OGameData), compiling data from over 100 next-generation open-world games.

image from GameGen-O

The model undergoes a two-stage training process: foundation model pretraining and instruction tuning.

First phase: GameGen-O is pre-trained on OGameData through text-to-video and video continuation tasks, equipping it with the ability to generate open-domain video game content.
Second phase: The pre-trained model is frozen and fine-tuned using a trainable InstructNet, enabling the production of subsequent frames based on multimodal structural instructions.

The Motive:

Back in August, it was announced that GameNGen (Google and Tel Aviv University) recreated the game Doom and rendered it in real time. GameNGen is a neural model-based game engine that allows real-time interaction with complex gaming environments without a traditional game engine.

GameNGen using AI to recreate Doom was a huge leap forward for AI videogame creation, and now we have the advancement from Tencent, which is a massive leap forward in just a few months.

It won’t be long before devs step away from traditional game engines and use AI platforms for development, leading the way to AI game engines and personalized gaming experiences.

AI Riddle Answer: Sand.

Mellow Motive

Want to give a shoutout to Mellow Motive or send us your feedback? Hit us up at [email protected]. Have a wonderful day.
