{"id":12028,"date":"2024-05-01T15:38:03","date_gmt":"2024-05-01T15:38:03","guid":{"rendered":"https:\/\/matchboxsoftware.com\/blog\/?p=12028"},"modified":"2024-05-01T15:38:05","modified_gmt":"2024-05-01T15:38:05","slug":"text-to-video-generator-vidu-chinas-sora","status":"publish","type":"post","link":"https:\/\/matchboxsoftware.com\/blog\/text-to-video-generator-vidu-chinas-sora\/","title":{"rendered":"Extraordinary Text to Video Generator Vidu: China\u2019s Reply to OpenAI\u2019s Sora [2024]"},"content":{"rendered":"<div role=\"navigation\" aria-label=\"Table of Contents\" class=\"simpletoc wp-block-simpletoc-toc\"><h2 style=\"margin: 0;\"><button type=\"button\" aria-expanded=\"false\" aria-controls=\"simpletoc-content-container\" class=\"simpletoc-collapsible\">Table of Contents<span class=\"simpletoc-icon\" aria-hidden=\"true\"><\/span><\/button><\/h2><div id=\"simpletoc-content-container\" class=\"simpletoc-content\"><style>html { scroll-behavior: smooth; }<\/style><ul class=\"simpletoc-list\">\n<li><a href=\"#introduction\">Introduction:<\/a>\n\n<\/li>\n<li><a href=\"#the-birth-of-text-to-video-generator-vidu\">The Birth of Text to Video Generator Vidu<\/a>\n\n<\/li>\n<li><a href=\"#the-vidu-experience\">The Vidu Experience<\/a>\n\n<\/li>\n<li><a href=\"#the-technical-marvel-universal-vision-transformer-uvit\">The Technical Marvel: Universal Vision Transformer (U-ViT)<\/a>\n\n<\/li>\n<li><a href=\"#the-quest-for-selfreliant-innovation\">The Quest for Self-Reliant Innovation<\/a>\n\n<\/li>\n<li><a href=\"#the-challenge-of-computing-power\">The Challenge of Computing Power<\/a>\n\n<\/li>\n<li><a href=\"#chinese-vidu-vs-sora-the-battle-continues\">Chinese Vidu vs. 
Sora: The Battle Continues<\/a>\n\n<\/li>\n<li><a href=\"#conclusion\">Conclusion:<\/a>\n\n<\/li>\n<li><a href=\"#frequently-asked-questions-faqs\">Frequently Asked Questions (FAQs):<\/a>\n<\/li><\/ul><\/div><\/div>\n\n<h2 class=\"wp-block-heading\" id=\"introduction\"><strong>Introduction:<\/strong><\/h2>\n\n\n<p>In the ever-evolving landscape of artificial intelligence, China has once again stepped into the spotlight with its latest innovation: <strong>Text to Video Generator Vidu<\/strong>. Developed by Chinese startup <strong>Shengshu Technology<\/strong> in collaboration with <strong>Tsinghua University<\/strong>, Vidu is poised to be a formidable competitor to OpenAI\u2019s renowned text-to-video generator, <strong>Sora<\/strong>.<\/p>\n\n\n<h2 class=\"wp-block-heading\" id=\"the-birth-of-text-to-video-generator-vidu\"><strong>The Birth of Text to Video Generator Vidu<\/strong><\/h2>\n\n\n<p>At the <strong>Zhongguancun Forum<\/strong> in Beijing, Shengshu Technology and Tsinghua University unveiled Vidu, a powerful AI-driven text-to-video app. While Sora boasts the ability to create 60-second videos, Text to Video Generator Vidu focuses on brevity, generating 16-second clips at <strong>1080p resolution<\/strong> with a single click. Although shorter in duration, Vidu represents the pinnacle of China\u2019s current capabilities in this domain.<\/p>\n\n\n\n<p>Also Read: <a href=\"https:\/\/matchboxsoftware.com\/blog\/mora-ai-open-source-alternative-to-sora\/\">Mora AI Open Source Alternative to Sora AI(2024)<\/a><\/p>\n\n\n<h2 class=\"wp-block-heading\" id=\"the-vidu-experience\"><strong>The Vidu Experience<\/strong><\/h2>\n\n\n<p>Vidu\u2019s magic lies in its simplicity. Users input text prompts, and within moments, the app weaves them into visually captivating videos. Imagine a panda strumming a guitar on a grassy field or a playful puppy frolicking in a pool. 
Vidu brings these scenes to life, maintaining consistent characters, settings, and timelines.<\/p>\n\n\n<h2 class=\"wp-block-heading\" id=\"the-technical-marvel-universal-vision-transformer-uvit\"><strong>The Technical Marvel: Universal Vision Transformer (U-ViT)<\/strong><\/h2>\n\n\n<p>Behind Vidu\u2019s output lies the <strong>Universal Vision Transformer (U-ViT)<\/strong>, a model architecture developed in-house. It combines two generative AI approaches, <strong>Diffusion<\/strong> and <strong>Transformer<\/strong>, in a single model. The result? Realistic videos with dynamic camera movements, expressive facial features, and natural lighting and shadows.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"Meet Vidu, A New Chinese Text to Video AI Model\" width=\"1200\" height=\"675\" src=\"https:\/\/www.youtube.com\/embed\/u1R-jxDPC70?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n<h2 class=\"wp-block-heading\" id=\"the-quest-for-selfreliant-innovation\"><strong>The Quest for Self-Reliant Innovation<\/strong><\/h2>\n\n\n<p>Zhu Jun, chief scientist at Shengshu and deputy dean at Tsinghua\u2019s Institute for AI, proudly describes Vidu as \u201cthe latest achievement of self-reliant innovation.\u201d The breakthroughs achieved by <a href=\"https:\/\/www.vidu.io\/text-to-video-ai\" data-type=\"link\" data-id=\"https:\/\/www.vidu.io\/text-to-video-ai\" target=\"_blank\" rel=\"noopener\">Text to Video Generator Vidu<\/a> are manifold, making it a significant milestone in China\u2019s AI journey. 
Moreover, Vidu\u2019s ability to comprehend \u201cChinese elements\u201d adds a unique touch, catering to its local audience.<\/p>\n\n\n\n<p>Also Read: <a href=\"https:\/\/matchboxsoftware.com\/blog\/sora-ai-video-generator-tool\/\">\u201cSora AI Video Generator Tool\u201d: Bridging Text to Video Creativity<\/a><\/p>\n\n\n<h2 class=\"wp-block-heading\" id=\"the-challenge-of-computing-power\"><strong>The Challenge of Computing Power<\/strong><\/h2>\n\n\n<p>While Vidu marks a significant leap forward, it\u2019s essential to acknowledge the challenges faced by Chinese AI developers. Sora, for instance, reportedly demands a hefty computing infrastructure, specifically <strong>eight NVIDIA A100 graphics processing units (GPUs)<\/strong>, to produce a single one-minute video clip. The scarcity of such computing power has hindered Chinese companies from matching Sora\u2019s prowess until now.<\/p>\n\n\n<h2 class=\"wp-block-heading\" id=\"chinese-vidu-vs-sora-the-battle-continues\"><strong>Chinese Vidu vs. Sora: The Battle Continues<\/strong><\/h2>\n\n\n<p>Unlike ChatGPT, which was followed by a plethora of Chinese imitations after its release in 2022, Sora remained unchallenged until Text to Video Generator Vidu emerged. The race for supremacy in text-to-video generation continues, fueled by determination and technical prowess. As the world watches, Vidu and Sora lock horns, each vying for the title of the ultimate text-to-video champion.<\/p>\n\n\n\n<p>Also Read: <a href=\"https:\/\/matchboxsoftware.com\/blog\/devika-open-source-ai-software-engineer\/\">Meet Devika Open-Source AI Software Engineer Bridging the Gap(2024)<\/a><\/p>\n\n\n<h2 class=\"wp-block-heading\" id=\"conclusion\"><strong>Conclusion:<\/strong><\/h2>\n\n\n<p>Text to Video Generator Vidu\u2019s arrival signifies China\u2019s unwavering commitment to AI innovation. 
As the global AI landscape evolves, we eagerly await the next chapter in this enthralling saga of technological rivalry.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n<h2 class=\"wp-block-heading\" id=\"frequently-asked-questions-faqs\"><strong>Frequently Asked Questions (FAQs):<\/strong><\/h2>\n\n\n<p>Here are <strong>five frequently asked questions (FAQs)<\/strong> about <strong>Text to Video Generator Vidu<\/strong>, China\u2019s response to OpenAI\u2019s text-to-video generator, Sora:<\/p>\n\n\n<div id=\"rank-math-faq\" class=\"rank-math-block\">\n<div class=\"rank-math-list \">\n<div id=\"faq-question-1714574893571\" class=\"rank-math-list-item\">\n<h4 class=\"rank-math-question \"><strong>What is Vidu?<\/strong><\/h4>\n<div class=\"rank-math-answer \">\n\n<p>Vidu is an innovative <strong>text-to-video AI model<\/strong> developed by Chinese startup <strong>Shengshu Technology<\/strong> in collaboration with <strong>Tsinghua University<\/strong>. It allows users to transform text prompts into visually captivating videos with a single click.<br \/>While shorter in duration than Sora, Text to Video Generator Vidu generates 16-second clips at <strong>1080p resolution<\/strong> and represents China\u2019s current best in this domain.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1714574921152\" class=\"rank-math-list-item\">\n<h4 class=\"rank-math-question \"><strong>How does Vidu work?<\/strong><\/h4>\n<div class=\"rank-math-answer \">\n\n<p>Vidu\u2019s magic lies in its simplicity. Users input text prompts, and the app weaves them into engaging videos. 
Whether it\u2019s a panda strumming a guitar on a grassy field or a playful puppy swimming in a pool, Vidu brings these scenes to life with consistent characters, settings, and timelines.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1714574936005\" class=\"rank-math-list-item\">\n<h4 class=\"rank-math-question \"><strong>What makes Vidu unique?<\/strong><\/h4>\n<div class=\"rank-math-answer \">\n\n<p>Vidu is built on a model architecture developed in-house called the <strong>Universal Vision Transformer (U-ViT)<\/strong>. This architecture combines two generative AI approaches, <strong>Diffusion<\/strong> and <strong>Transformer<\/strong>, in a single model. The result? Realistic videos with dynamic camera movements, expressive facial features, and natural lighting and shadows.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1714574949946\" class=\"rank-math-list-item\">\n<h4 class=\"rank-math-question \"><strong>How does Vidu compare to Sora?<\/strong><\/h4>\n<div class=\"rank-math-answer \">\n\n<p>Vidu aims to rival OpenAI\u2019s Sora. While Sora reportedly demands significant computing power (eight NVIDIA A100 GPUs) to create one-minute video clips, Vidu focuses on brevity, producing 16-second videos. Vidu\u2019s arrival signifies China\u2019s commitment to self-reliant AI innovation, and it adds a unique touch by comprehending \u201cChinese elements\u201d in its videos.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1714574964360\" class=\"rank-math-list-item\">\n<h4 class=\"rank-math-question \"><strong>What challenges does Vidu face?<\/strong><\/h4>\n<div class=\"rank-math-answer \">\n\n<p>Despite its breakthroughs, Vidu faces the obstacle of inadequate computing power. Sora\u2019s computing demands have hindered Chinese companies from matching its prowess until now. 
However, Vidu\u2019s emergence signals a new chapter in the enthralling saga of technological rivalry between text-to-video generators.<\/p>\n\n<\/div>\n<\/div>\n<\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>Introduction: In the ever-evolving landscape of artificial intelligence, China has once again stepped into the spotlight with its latest innovation: Text to Video Generator Vidu. Developed by Chinese startup Shengshu Technology in collaboration with Tsinghua University, Vidu is poised to be a formidable competitor to OpenAI\u2019s renowned text-to-video generator, Sora. The Birth of Text to<\/p>\n<div class=\"read-more-section\"><a class=\"button\" href=\"https:\/\/matchboxsoftware.com\/blog\/text-to-video-generator-vidu-chinas-sora\/\">Continue Reading &rarr;<\/a><\/div>\n","protected":false},"author":1,"featured_media":12036,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_eb_attr":"","footnotes":""},"categories":[64,84],"tags":[294,291,293,292],"class_list":["post-12028","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-ai-powered","tag-chinese-sora-atlernative-vidu","tag-text-to-video-generator-vidu","tag-vidu","tag-vidu-text-to-video-generator"],"_links":{"self":[{"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/posts\/12028"}],"collection":[{"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/comments?post=12028"}],"version-history":[{"count":9,"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/posts\/12028\/revisions"}],"predecessor-version":[{"id":12039,"href":"https:\/\/matchboxsoftware.com\/blog\/wp-
json\/wp\/v2\/posts\/12028\/revisions\/12039"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/media\/12036"}],"wp:attachment":[{"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/media?parent=12028"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/categories?post=12028"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/matchboxsoftware.com\/blog\/wp-json\/wp\/v2\/tags?post=12028"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}