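This YouTube channel is only three months old and has posted 50 videos, yet it has already passed 200 million views. The odd part is that, for all those numbers, the content is not new at all; you have probably seen these topics long ago. It invented no new topics and uses no technique that is hard to learn. It simply follows a path that has already been proven.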

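Below is a brief breakdown of the technical method for building a similar channel, along with some useful strategies and ideas. The full prompts and AI generation parameters are available for free at the end of the article.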

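Copy the first prompt and paste it into ChatGPT. It returns 10 curiosity-driven video topics with built-in traffic, such as "How long can a person go without sleep?", "What happens if you go without sleep every day?", and "How long can a person go without water?". These topics are already proven: they grab attention on their own and leave viewers wanting the answer.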

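Suppose I pick a topic I like: "What happens if you go without sleep every day?". Just enter the number of the topic you chose, and ChatGPT generates an engaging video script.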
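PROMPT 1 at the end of this article asks for a 45-70 second script. One rough way to check whether a generated script lands in that range is to estimate its spoken length from the word count. The sketch below assumes a narration pace of 150 words per minute; that pace is my assumption, not something from the workflow, and the real figure depends on the voice you pick. The function names are made up for illustration, and the whitespace split only makes sense for English scripts.

```python
def estimate_seconds(script: str, words_per_minute: float = 150.0) -> float:
    """Estimate narration length in seconds from the word count.

    words_per_minute is an assumed speaking pace, not a measured one.
    """
    words = len(script.split())
    return words / words_per_minute * 60.0


def fits_target(script: str, low: float = 45.0, high: float = 70.0,
                words_per_minute: float = 150.0) -> bool:
    """Return True if the estimated length falls inside the 45-70 s window."""
    seconds = estimate_seconds(script, words_per_minute)
    return low <= seconds <= high
```

If the estimate falls outside the window, asking ChatGPT to shorten or extend the script before generating any images saves rework later.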

Next, copy the second prompt into ChatGPT, paste in the script you just generated, and submit. You get detailed image and video generation prompts for every shot.
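If you keep PROMPT 2 in a text file, the paste step can be done with a small helper instead of by hand. This is only a sketch: it relies on the `[PASTE SCRIPT HERE]` placeholder that appears in PROMPT 2, and the function name `fill_prompt` is hypothetical. It does not call ChatGPT; it just builds the text you would paste.

```python
PLACEHOLDER = "[PASTE SCRIPT HERE]"


def fill_prompt(template: str, script: str) -> str:
    """Insert the narration script into the prompt template.

    Raises ValueError if the placeholder is missing or repeated,
    or if the script is empty, so a bad paste is caught early.
    """
    if template.count(PLACEHOLDER) != 1:
        raise ValueError("template must contain the placeholder exactly once")
    script = script.strip()
    if not script:
        raise ValueError("script is empty")
    return template.replace(PLACEHOLDER, script)
```

Checking for exactly one placeholder is a deliberate choice: it catches the case where the template was already filled in with an older script.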


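These prompts are much longer than usual, but it is exactly this detail that lets the AI produce more realistic, more professional images and videos. Next, open the Higgsfield website, click the image option in the top-left corner, then click create image to enter the image generation screen. Pick a generation model first, for example Nano Banana Pro, then choose the aspect ratio and quality, paste in the first image prompt from ChatGPT, and click generate. Within seconds you get an image that matches the requirements.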

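You can also try another model, such as Seedream 4.5: set the same parameters and paste the prompt to generate an image. Comparing the two, I still find Nano Banana Pro gives better results, so I use it for all the remaining images. Repeat this process for every image prompt and you will have an image for each shot in the script.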
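Going through every image prompt is easier if the ChatGPT reply is first split into scenes. The sketch below parses a reply that follows the OUTPUT FORMAT section of PROMPT 2 ("Scene X:" followed by four labelled bullets). It assumes the reply sticks to that plain layout; the function name `parse_scenes` and the dictionary keys are my own. A prompt that wraps over several lines is joined back together, but a continuation line that itself begins with one of the four labels would be misread as a new field.

```python
import re

# Map the labels from PROMPT 2's output format to dictionary keys.
FIELDS = {
    "time checkpoint": "time_checkpoint",
    "environment": "environment",
    "image prompt": "image_prompt",
    "image-to-video prompt": "video_prompt",
}

SCENE_RE = re.compile(r"^\s*Scene\s+(\d+)\s*:\s*$", re.IGNORECASE)
FIELD_RE = re.compile(
    r"^\s*[●•\-*]?\s*"
    r"(Time checkpoint|Environment|Image prompt|Image-to-video prompt)"
    r"\s*:\s*(.*)$",
    re.IGNORECASE,
)


def parse_scenes(text: str) -> list[dict]:
    """Split a reply in the 'Scene X:' format into one dict per scene."""
    scenes = []
    current = None
    key = None
    for line in text.splitlines():
        scene_match = SCENE_RE.match(line)
        if scene_match:
            current = {"number": int(scene_match.group(1))}
            current.update({name: "" for name in FIELDS.values()})
            scenes.append(current)
            key = None
            continue
        if current is None:
            continue  # ignore anything before the first scene header
        field_match = FIELD_RE.match(line)
        if field_match:
            key = FIELDS[field_match.group(1).lower()]
            current[key] = field_match.group(2).strip()
        elif key and line.strip():
            # Continuation of a multi-line prompt: append to the open field.
            current[key] = (current[key] + " " + line.strip()).strip()
    return scenes
```

With the scenes in a list, `scene["image_prompt"]` gives the text to paste for each image in order, and `scene["video_prompt"]` is ready for the video step.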

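Once the images are done, you reach the most important step: video generation. Click create video on the site, select the Kling 3.0 model, upload the first image you generated, and paste in the corresponding first video prompt from ChatGPT.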

Choose the video duration and aspect ratio, click generate, and within seconds you get a moving clip.

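Then upload the second image, copy the corresponding second video prompt, keep the parameters unchanged, and click generate to get another matching clip. Repeat until every image has been turned into its video clip.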
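Because the same settings are reused for every clip, it helps to write the pairing down as a checklist before clicking through the site. The sketch below only builds that checklist in memory; it does not talk to Higgsfield or Kling, and the field names are hypothetical. The default duration (5 seconds) and aspect ratio (9:16, the usual vertical format for Shorts) are my assumptions, since the exact values are not stated here.

```python
def build_video_jobs(image_files, video_prompts, duration_s=5,
                     aspect_ratio="9:16", model="Kling 3.0"):
    """Pair each image with its video prompt under one shared set of settings.

    Returns a list of plain dicts to work through by hand. The defaults
    for duration_s and aspect_ratio are illustrative assumptions.
    """
    if len(image_files) != len(video_prompts):
        raise ValueError("need exactly one video prompt per image")
    jobs = []
    for number, (image, prompt) in enumerate(
            zip(image_files, video_prompts), start=1):
        jobs.append({
            "scene": number,
            "model": model,
            "image": image,
            "prompt": prompt,
            "duration_s": duration_s,
            "aspect_ratio": aspect_ratio,
        })
    return jobs
```

The length check is the useful part: a missing image or a skipped prompt shows up before any generation credits are spent.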

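With the clips ready, open ElevenLabs to generate the voiceover. Paste in the script and pick a deep, clear voice; this kind of voice works best for educational science content.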

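After generating, download the voiceover. Finally, open any video editor, import all the clips and the voiceover, drag them onto the timeline, and adjust each clip's length and order to the rhythm of the narration. Once the edit is done, you have a complete skeleton science video.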
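Matching clip lengths to the narration can be estimated before opening the editor. One simple heuristic, which is my assumption rather than the method used by the channel, is to give each scene a share of the voiceover proportional to the number of words narrated over it. The second helper flags scenes whose share exceeds the longest clip the video model can produce; its 15-second default matches the Kling limit mentioned in this article. Both function names are hypothetical.

```python
def allocate_durations(voiceover_seconds, scene_texts):
    """Split the voiceover length across scenes in proportion to word count."""
    counts = [len(text.split()) for text in scene_texts]
    total = sum(counts)
    if total == 0:
        raise ValueError("scene texts contain no words")
    return [voiceover_seconds * count / total for count in counts]


def scenes_needing_split(durations, max_clip_seconds=15.0):
    """Return 1-based scene numbers whose target length exceeds one clip."""
    return [number for number, seconds in enumerate(durations, start=1)
            if seconds > max_clip_seconds]
```

For example, a 60-second voiceover over three scenes with 3, 6, and 3 words of narration yields targets of 15, 30, and 15 seconds, and the second scene is flagged because it would need more than one clip or a slowed-down edit.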

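I won't embed the videos here; they are worth studying directly on the channel mentioned above. That is the complete workflow for making viral skeleton science videos with AI; adjust and optimize it for your own use. Its strength is that you do not need to juggle a pile of tools: most of the work happens inside Higgsfield (or another AI tool platform), with Kling 3.0 as the video engine. Kling 3.0 Zero produces smoother motion and more realistic physics, and can generate clips up to 15 seconds long, which makes transitions across the video more natural. That matters a lot for science content, since continuity and clarity directly shape the viewing experience.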
PROMPT 1:
You are writing narration for a viral YouTube Shorts channel that explains human limits and biological failure.
REFERENCE STYLE (STRICT):
● Calm
● Clinical but conversational (NOT academic)
● Slightly ominous
● Second-person (“you”)
● Short sentences
● Simple language
● Everyday comparisons
● No advice, no warnings, no disclaimers
ABSOLUTE LANGUAGE RESTRICTIONS
You are NOT allowed to:
● Use medical jargon
● Name diseases or diagnoses
● Describe internal processes the viewer cannot feel
● Sound like a textbook or research paper
● Explain mechanisms in detail
Instead:
● Describe what the person notices
● Describe what starts failing externally
● Use comparisons to familiar states (fatigue, intoxication, machines, loss of control)
PHASE 1: IDEA GENERATION
Generate 10 short-form video ideas using:
● “How Long Can You __?”
● “What Happens If You __ Every Day?”
● “How Much __ Is TOO Much?”
Rules:
● Human body or brain only
● Escalation over time
● Visually explainable
● Slightly dangerous
● Grounded in real life
For each idea:
● Title
● One-sentence failure path written in simple language
Ask the user to choose ONE idea by number.
Stop.
PHASE 2: SCRIPT GENERATION
Write a 45–70 second script using this structure:
STRUCTURE:
● Opening question (1 sentence)
● Time checkpoints (Hour / Day / Week / Month / Year)
● At each checkpoint include:
  ● What you physically feel
  ● What you mentally notice
  ● One familiar comparison (drunk, exhausted, machine overheating, signal loss)
● One sudden realization moment (memory gap, loss of awareness, loss of control)
● Final irreversible failure
● End visually and abruptly
STYLE RULES:
● Plain language
● No disease names
● No lab terms
● No abstract biology
● Every line must be easy to imagine visually
Output ONLY the script.
PROMPT 2: Image Prompts
You are an AI video director and prompt engineer creating photorealistic, high-quality visuals for a viral short-form video.
Your task is to convert a narration script into scene-by-scene IMAGE PROMPTS and IMAGE-TO-VIDEO PROMPTS with strict visual consistency.
INPUT
Video Script: [PASTE SCRIPT HERE]
ABSOLUTE VISUAL ANCHOR (NON-NEGOTIABLE)
ALL scenes MUST use the SAME anatomical character design described below. Only the POSE, BODY POSITION, and ENVIRONMENT may change.
MAIN CHARACTER — HARD LOCK
For EVERY scene, the character MUST be described EXACTLY as follows (do NOT shorten, summarize, or reference indirectly):
A full-body realistic humanoid SKELETON character with a semi-transparent human-shaped outer body shell.
The character has:
● A fully exposed skull (NO skin, NO face, NO muscles)
● Clean, smooth, anatomically accurate skull
● Large, round eye sockets with visible eyeballs
● Bright yellow irises with dark pupils
● Neutral to slightly vacant expression
● Visible upper and lower teeth
● Smooth cranium with no cracks, damage, decay, or horror elements
The body is a semi-transparent, glass-like human silhouette that clearly reveals the entire internal skeletal structure from head to toe.
Skeleton details:
● Ivory / pale beige bones
● Smooth, medical-grade surfaces
● Accurate human proportions
● Clearly defined rib cage, spine, pelvis, arms, hands, legs, knees, ankles, and feet
● All joints, vertebrae, and phalanges visible and anatomically correct
No muscles.
No veins.
No organs.
No skin texture.
The style is:
● High-end medical visualization
● Clean, clinical, modern
● NOT horror
● NOT zombie
● NOT cartoon
● NOT decayed
POSE & ACTION RULE
The character’s POSE, BODY POSITION, and GESTURE MUST change per scene to match the script.
Examples:
● Sitting on bed scrolling phone
● Rubbing head in confusion
● Walking slowly
● Slumped posture
● Dropping an object
● Collapsing into a chair
DO NOT keep a neutral standing pose unless the script explicitly implies it.
ENVIRONMENT RULE
For EACH scene:
● Infer the environment directly from the script
● Place the skeleton character naturally inside that environment
● Environment must be realistic and context-appropriate
Examples:
● Bedroom → skeleton sitting or lying on bed
● Office → skeleton at desk
● Street → skeleton walking
● Chair → skeleton slumped
NO fixed white background unless the script implies a studio or medical lab.
CAMERA LOCK
● Medium or medium-wide shots only
● Eye-level or chest-level camera
● No extreme angles
● No dramatic lens changes
● Same framing logic across scenes
LIGHTING & REALISM
● Real-world lighting matching the environment
● Natural shadows
● Subtle reflections on transparent body shell
● Photorealistic cinematic realism
● Clean medical look
● NOT stylized
● NOT exaggerated
TASK 1: SCENE BREAKDOWN
Break the script into scenes by time or event.
For each scene, specify:
● Scene number
● Time checkpoint
● Environment
● Pose / action change from previous scene
TASK 2: IMAGE PROMPTS
For EACH scene, generate a FULL, STANDALONE IMAGE PROMPT that includes:
1. FULL character description (repeated verbatim)
2. Scene-specific environment
3. Scene-specific pose and body language
4. Camera framing
5. Lighting
6. Mood
7. Realism constraints
DO NOT say “same character”. DO NOT shorten descriptions.
TASK 3: IMAGE-TO-VIDEO PROMPTS
For EACH scene, generate an image-to-video prompt describing:
● Subtle body movement
● Minimal natural motion
● Environmental motion
● Very slight camera drift only
No fast movement.
No animation feel.
Everything must feel real and continuous.
OUTPUT FORMAT
Scene X:
● Time checkpoint:
● Environment:
● Image prompt:
● Image-to-video prompt:
No explanations.
No commentary.
Production-ready output only.