OpenAI Sora: Distinctions from ChatGPT and Its User Base
OpenAI has officially introduced Sora, its text-to-video generator, and made it available to the public. Just as a short prompt in ChatGPT can produce a long passage of text, a prompt in Sora produces a video. Sora can also animate uploaded images. For instance, you could submit a vintage photograph of your great-grandfather and watch him walk or run. You may be wondering how Sora differs from Large Language Models (LLMs) like GPT-4, which underpins ChatGPT. What training data does it use? And, most importantly, is Sora available to everyone? The sections below answer each of these questions.
How Is Sora Different from Text-Based Large Language Models?
Sora is an AI model designed for text-to-video generation, whereas GPT is a Large Language Model. Although the two serve different purposes, their inputs overlap to some extent, because GPT-4 is multimodal and can accept images as well as text.
For example, Sora can generate a video of a mountain ridge from a detailed prompt. The prompt can specify features such as whether the mountains are covered in snow or whether the sun is shining. Sora can also animate an existing image. In short, it turns text, images, or videos into video output.
In contrast, GPT models are limited to producing text outputs, regardless of whether the inputs are text or images. This distinction in output capabilities differentiates the two models.
One might wonder whether ChatGPT or Google Gemini can create images, given that they are built on GPT-4 and Gemini, respectively. The language models themselves do not; the chatbots rely on separate systems such as DALL-E 3 (from OpenAI) and Imagen 3 (from Google) to generate images from text.
Sora can also extend an existing video in time, either forward or backward.
How Was Sora Trained?
OpenAI asserts that Sora has been trained using videos and images of diverse durations, resolutions, and aspect ratios. The organization indicates that it utilizes a Transformer architecture, which analyzes space-time patches of video and image latent codes.
From a technical standpoint, the two kinds of model are trained on different units of data. Large language models such as OpenAI's GPT-4o or Meta's Llama are generally trained on what are known as tokens, whereas text-to-video models like Sora follow a different process.
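To make the idea of tokens concrete, here is a deliberately simplified sketch. It splits text on whitespace and maps each word to an integer ID. This is an assumption made for illustration only: production LLMs use subword tokenizers (for example, byte-pair encoding) with far larger vocabularies, and the function names `build_vocab` and `encode` are hypothetical.

```python
def build_vocab(corpus):
    """Assign an integer ID to each distinct word, in order of first appearance."""
    vocab = {}
    for word in corpus.lower().split():
        vocab.setdefault(word, len(vocab))
    return vocab


def encode(text, vocab, unk_id=-1):
    """Turn text into a list of token IDs; unknown words map to unk_id."""
    return [vocab.get(word, unk_id) for word in text.lower().split()]


# Toy corpus and prompt, chosen only for illustration.
vocab = build_vocab("the sun shines on the snowy mountain ridge")
ids = encode("the snowy ridge", vocab)
print(ids)  # [0, 4, 6]
```

A language model never sees the raw characters; it is trained to predict the next ID in sequences like this one.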
Instead of tokens, OpenAI trains Sora on what it calls visual patches. A video is first compressed into a lower-dimensional latent space, and the resulting representation is then decomposed into space-time patches, which play the role that tokens play in a language model.
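The decomposition step can be sketched with NumPy. OpenAI has not published Sora's code, so this is only an illustration of cutting a video-shaped array into space-time patches, not Sora's actual implementation. It assumes the compression into a latent space has already happened; the function name `to_spacetime_patches`, the latent shape, and the patch sizes are all hypothetical.

```python
import numpy as np


def to_spacetime_patches(latent, pt, ph, pw):
    """Split a latent video of shape (T, H, W, C) into flattened space-time patches.

    Each patch spans pt frames, ph rows, and pw columns.
    Returns an array of shape (num_patches, pt * ph * pw * C).
    """
    T, H, W, C = latent.shape
    if T % pt or H % ph or W % pw:
        raise ValueError("latent dimensions must be divisible by the patch size")
    # Break each axis into (number of blocks, block size).
    x = latent.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # Move the block indices to the front so each patch's values are grouped.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # Flatten every patch into one vector, analogous to one token.
    return x.reshape(-1, pt * ph * pw * C)


# Hypothetical latent: 8 frames, a 16x16 spatial grid, 4 channels.
latent = np.arange(8 * 16 * 16 * 4, dtype=np.float32).reshape(8, 16, 16, 4)
patches = to_spacetime_patches(latent, pt=2, ph=4, pw=4)
print(patches.shape)  # (64, 128)
```

Because the patch count depends only on the input dimensions, the same routine handles clips of different durations, resolutions, and aspect ratios, which matches the variety of training data described above.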
Who Is Sora Available For?
At present, Sora is not available to users without a subscription. To gain access, one must purchase either the ChatGPT Plus or the ChatGPT Pro subscription.
The Plus subscription, priced at ₹2,000 per month in India, lets users generate up to 50 Sora videos each month. The Pro subscription, which costs $200 per month, allows up to 500 videos with faster generation. Choosing a higher resolution reduces the total number of videos that can be generated. For those willing to wait, the slower generation mode offers unlimited video generation.
There is also a length limit: videos are capped at a maximum of 20 seconds. Users can choose from widescreen, vertical, or square aspect ratios.
Specifically, ChatGPT Plus subscribers can create up to 50 videos at 480p resolution. 720p generation is also an option, but it reduces the number of videos available.
Due to heavy demand, OpenAI is currently not accepting new signups for Sora. OpenAI's CEO, Sam Altman, has indicated that signups have been temporarily paused and will resume once demand levels off, and that the company is working to resolve the issue as quickly as possible. Even so, it may take some time before all users can access Sora.