Devlog: YouTube Transcript and Summary Python Code

Posted in STEMGeeks

Disclaimer: This devlog was partly generated by AI. Of course, I've gone over it for accuracy and personality.



Devlog: Creating a YouTube Summarizer Using NanoGPT

Over the past week, I worked on perfecting my workflow for putting summaries of YouTube videos on the blockchain. I'm not the first to attempt this: @taskmaster4450le is a big advocate of filling this blockchain with data, and @mightpossibly has already made a summarization bot you can pay to access.

My workflow and the resulting summaries are a bit different from his, though. At first, it was a manual process, but I started automating parts of it, and I finally reached a complete version.

Below are the scripts I wrote with ChatGPT's help before combining their functionality into one automatic script.

Background 1: Fetching Transcriptions

yt_transcript.py:

import sys
import re
import os
from youtube_transcript_api import YouTubeTranscriptApi
from bs4 import BeautifulSoup
import requests

def extract_video_id(url):
    # Matches youtu.be, embed/shorts, and standard watch?v= URLs
    regex = r'(?:https?://)?(?:www\.)?(?:youtube\.com/(?:watch\?(?:\S*?&)?v=|(?:v|e|embed|shorts)/)|youtu\.be/)([a-zA-Z0-9_-]{11})'
    match = re.search(regex, url)
    
    if match:
        return match.group(1)
    return None

def get_video_url():
    if len(sys.argv) > 1:
        for arg in sys.argv[1:]:
            if not arg.startswith('-'):
                return arg
    
    video_url = input("No video URL provided. Please enter a YouTube video URL: ")
    return video_url

def fetch_video_title(video_id):
    try:
        print(f"Fetching video title for video ID '{video_id}'")
        url = f"https://www.youtube.com/watch?v={video_id}"
        response = requests.get(url)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        title = soup.find("title").text
        # Remove " - YouTube" from the end of the title if present
        if title.endswith(" - YouTube"):
            title = title[:-10]
        return title
    except Exception as e:
        print(f"Error fetching video title: {e}")
        return None

def fetch_transcript(video_id):
    try:
        print(f"Fetching transcript for video ID '{video_id}'")
        transcript = YouTubeTranscriptApi.get_transcript(video_id)
        print("Transcript fetched successfully.")
        
        transcript_text = "\n".join([entry['text'] for entry in transcript])
        return transcript_text
    except Exception as e:
        print(f"Error: {e}")
        return None

def sanitize_filename(name):
    return re.sub(r'[^a-zA-Z0-9-_]', '-', name).strip('-')

def save_transcript(title, transcript):
    sanitized_title = sanitize_filename(title)
    filename = f"{sanitized_title}.txt"
    with open(filename, 'w', encoding='utf-8') as f:
        f.write(f"{title}\n\nTranscript:\n{transcript}")
    print(f"Transcript saved to '{filename}'")

if __name__ == "__main__":
    video_url = get_video_url()
    video_id = extract_video_id(video_url)

    if not video_id:
        print("Error: Invalid YouTube URL. Unable to extract video ID.")
    else:
        title = fetch_video_title(video_id)
        if not title:
            print("Error: Unable to fetch video title.")
        else:
            transcript = fetch_transcript(video_id)
            if transcript:
                save_transcript(title, transcript)
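As a sanity check on the URL parsing, here's a quick self-test of the ID-extraction regex against common URL shapes (the pattern here also accepts plain `watch?v=` links; the 11-character ID below is just a placeholder):

```python
import re

# Variant of the extraction pattern that handles watch?v=, youtu.be,
# embed, v, e, and shorts URL forms
VIDEO_ID_RE = re.compile(
    r'(?:https?://)?(?:www\.)?'
    r'(?:youtube\.com/(?:watch\?(?:\S*?&)?v=|(?:v|e|embed|shorts)/)|youtu\.be/)'
    r'([a-zA-Z0-9_-]{11})'
)

def extract_video_id(url):
    match = VIDEO_ID_RE.search(url)
    return match.group(1) if match else None

# "dQw4w9WgXcQ" is an arbitrary 11-character placeholder ID
for url in (
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/dQw4w9WgXcQ",
    "https://www.youtube.com/shorts/dQw4w9WgXcQ",
):
    assert extract_video_id(url) == "dQw4w9WgXcQ"
assert extract_video_id("https://example.com/") is None
```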

Background 2: Prompting the AI's API

For this code, I'm using the NanoGPT API, which lets me talk to many large language models, including Llama 3.3 70B, the one I'll be using. I pay for it with the $NANO cryptocurrency.

nanogpt.py:

import requests
import json
import os
import argparse
from dotenv import load_dotenv

# Load API key from .env file
load_dotenv()
API_KEY = os.getenv("API_KEY")

# Constants
BASE_URL = "https://nano-gpt.com/api"
OUTPUT_FILE = "response_output.txt"

headers = {
    "x-api-key": API_KEY,
    "Content-Type": "application/json"
}

def talk_to_gpt(prompt, system_prompt=None, model="llama-3.3-70b", messages=None):
    """
    Send a prompt to the NanoGPT API.

    Args:
        prompt (str): The main user prompt.
        system_prompt (str, optional): The system-level prompt for setting context. Defaults to None.
        model (str, optional): The model to use. Defaults to "llama-3.3-70b".
        messages (list, optional): Conversation history messages. Defaults to None.

    Returns:
        str: The API response, or None if there was an error.
    """
    # Copy to avoid mutating the caller's list (and never share a mutable default)
    messages = list(messages) if messages else []
    if system_prompt:
        messages.insert(0, {"role": "system", "content": system_prompt})

    data = {
        "prompt": prompt,
        "model": model,
        "messages": messages
    }

    try:
        response = requests.post(f"{BASE_URL}/talk-to-gpt", headers=headers, json=data)
        if response.status_code == 200:
            return response.text
        else:
            print(f"Error {response.status_code}: {response.text}")
            return None
    except requests.RequestException as e:
        print("An error occurred:", e)
        return None

def save_response_to_file(response):
    """
    Save the response to a text file, ensuring Unicode characters are written correctly.
    
    Args:
        response (str): The response to save.
    """
    with open(OUTPUT_FILE, "w", encoding="utf-8") as file:
        file.write(response)

def main():
    """
    Parse command-line arguments and interact with the NanoGPT API.
    """
    # Argument parsing
    parser = argparse.ArgumentParser(description="Interact with the NanoGPT API.")
    parser.add_argument("-p", "--prompt", type=str, help="The main prompt to send to the GPT model.")
    parser.add_argument("-s", "--system", type=str, help="The system-level prompt to set context (optional).")
    parser.add_argument("-m", "--model", type=str, default="llama-3.3-70b", help="The model to use (default: llama-3.3-70b).")
    args = parser.parse_args()

    # Check if a prompt is provided; if not, ask the user
    prompt = args.prompt or input("Enter your prompt: ")
    system_prompt = args.system
    model = args.model

    # Example messages (modify as needed)
    messages = [
        {"role": "user", "content": "I'll provide you with the video transcript now."},
        {"role": "assistant", "content": "Please go ahead and share the transcript of the video. I'll be happy to assist you with anything you need."}
    ]

    # Call the function
    response = talk_to_gpt(prompt, system_prompt=system_prompt, model=model, messages=messages)
    if response:
        # Split the response to separate the text and NanoGPT info
        parts = response.split('<NanoGPT>')
        text_response = parts[0].strip()

        # Decode escaped Unicode sequences if present; fall back to the raw text
        try:
            decoded_text_response = json.loads(f'"{text_response}"')
        except json.JSONDecodeError:
            decoded_text_response = text_response

        # Extract the NanoGPT info
        try:
            nano_info = json.loads(parts[1].split('</NanoGPT>')[0])

            # Combine and format the final output as JSON
            final_output = {
                "response_text": decoded_text_response,
                "nano_gpt_info": nano_info
            }

            # Pretty print the JSON and save to a .txt file
            formatted_output = json.dumps(final_output, indent=4, ensure_ascii=False)
            print("Formatted Output:", formatted_output)
            save_response_to_file(formatted_output)

            print(f"Response saved to {OUTPUT_FILE}")
        except (IndexError, json.JSONDecodeError):
            print("Failed to parse NanoGPT info.")
            save_response_to_file(f"Response: {decoded_text_response}\n\nError parsing NanoGPT info.")
    else:
        print("Failed to get response from GPT")

# Ensure the script can be imported and used elsewhere
if __name__ == "__main__":
    main()


The App Devlog

Step 1: Setting the Foundation

I wanted to build a script that could fetch YouTube transcripts, send them to NanoGPT, and generate a meaningful summary. The outputs should be in a neat Markdown format.

So, the main function of the script takes a YouTube URL, fetches its transcript, sends it to NanoGPT, and saves the result.

Code Snippet:

video_id = extract_video_id(video_url)
transcript = fetch_transcript(video_id)
response = talk_to_gpt(transcript)
print(response)

Step 2: Dynamic File Naming

The next step was to save each summary under a filename based on the YouTube video's title, sanitized of special characters first. This prevents overwriting and makes files easy to identify.

Milestone:

Each output file is named dynamically with the format Summary-{SanitizedTitle}.md.

Code Snippet:

sanitized_title = sanitize_filename(title)
filename = f"Summary-{sanitized_title}.md"
with open(filename, 'w', encoding='utf-8') as f:
    f.write(summary)
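For example, the sanitizer collapses anything outside `[a-zA-Z0-9-_]` to hyphens and trims stray hyphens from the edges:

```python
import re

def sanitize_filename(name):
    # Replace every disallowed character with '-', then trim leading/trailing hyphens
    return re.sub(r'[^a-zA-Z0-9-_]', '-', name).strip('-')

assert sanitize_filename("My Video: Part 1!") == "My-Video--Part-1"
assert sanitize_filename("(Live) Q&A") == "Live--Q-A"
```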

Step 3: Enhanced Help Messages

To make the script user-friendly, I added -h and --help flags with detailed usage instructions. This ensures clarity for anyone using the script.

Milestone:

The script now supports --help to explain all the arguments.

Code Snippet:

parser = argparse.ArgumentParser(
    description="Fetch YouTube transcript and generate summary using NanoGPT."
)
parser.add_argument("-u", "--url", type=str, help="The YouTube video URL.")
parser.add_argument("-p", "--prompt", type=str, help="The main prompt.")
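argparse builds the `-h`/`--help` text automatically from these declarations; a quick way to confirm the flags and descriptions show up is `format_help()`, which returns the same text that `--help` prints:

```python
import argparse

parser = argparse.ArgumentParser(
    description="Fetch YouTube transcript and generate summary using NanoGPT."
)
parser.add_argument("-u", "--url", type=str, help="The YouTube video URL.")
parser.add_argument("-p", "--prompt", type=str, help="The main prompt.")

# format_help() returns the same text -h/--help would print
help_text = parser.format_help()
assert "--url" in help_text and "--prompt" in help_text
```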

Step 4: Default System Prompt

A default system prompt was added to guide the summarization process. If a custom prompt is provided, it takes priority.

Milestone:

The script uses a high-quality default prompt but allows custom overrides.

Code Snippet:

DEFAULT_SYSTEM_PROMPT = """Your output should use the following template:

"### Title

One_Paragraph

### Heading 1
- ✅ Bulletpoint
- ✅ Bulletpoint
etc

### Heading 2
- ✅ Bulletpoint
- ✅ Bulletpoint
etc"

Below is a transcript from a YouTube video. Clean the transcript's text, then write a summary from the video content.

You Must:
- Generate a title for the summary, especially if the video title was misleading.
- Write a high-quality one-paragraph summary in 120 words or less.
- Convert the transcript into bullet points, maintaining all the key information.
- Each bullet point should be concise, focusing on one main idea. Expand on it a bit when appropriate.
- Every bullet point starts with a suitable emoji (to replace ✅) based on its text.
- Categorize bullet points that follow the same topic under an appropriately titled subheading, ordered by their mention order from the video's transcript.
- If an idea is linked to a time/date in the video, mention the time and date.
- If an idea was reinforced with an example, mention the example.
- Be as similar as possible to the video's voice and manner of speech. Avoid redundant language."""

if args.system:
    system_prompt = args.system
else:
    system_prompt = DEFAULT_SYSTEM_PROMPT

Step 5: Human-Readable JSON Output

I improved the NanoGPT response formatting by displaying the JSON part in a human-readable way, both in the terminal and in the Markdown file. The model's output includes emoji, which took some extra care to display correctly in the console.

Milestone:
JSON outputs are now neatly indented for better readability.

Code Snippet:

formatted_json = json.dumps(nano_info, indent=4)
f.write(f"\n\n<NanoGPT>\n{formatted_json}\n</NanoGPT>")
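The emoji point matters here: by default, `json.dumps` escapes non-ASCII characters to `\uXXXX` sequences, while `ensure_ascii=False` keeps them readable in the output file:

```python
import json

info = {"note": "✅ summary saved"}

# Default: non-ASCII is escaped — safe for any console, but hard to read
assert json.dumps(info) == '{"note": "\\u2705 summary saved"}'

# ensure_ascii=False keeps the emoji readable in the Markdown file
assert json.dumps(info, ensure_ascii=False) == '{"note": "✅ summary saved"}'
```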

Step 6: Structuring the Final Output

Finally, I structured the output to include the video title and link at the top, followed by the summary, with clear separations.

Code Snippet:

f.write(f"Title: {title}\nLink: {video_url}\n\n{summary}\n\n\n\n{transcript}")

Full Code

yt_ai_summary.py:

import sys
import re
import os
from youtube_transcript_api import YouTubeTranscriptApi
from bs4 import BeautifulSoup
import requests
import argparse
from dotenv import load_dotenv
import json

# Load API key from .env file
load_dotenv()
API_KEY = os.getenv("API_KEY")

# Constants for NanoGPT API
BASE_URL = "https://nano-gpt.com/api"
OUTPUT_FILE = "response_output.txt"

headers = {
    "x-api-key": API_KEY,
    "Content-Type": "application/json"
}

# Default system prompt
DEFAULT_SYSTEM_PROMPT = """Your output should use the following template:

"### Title

One_Paragraph

### Heading 1
- ✅ Bulletpoint
- ✅ Bulletpoint
etc

### Heading 2
- ✅ Bulletpoint
- ✅ Bulletpoint
etc"

Below is a transcript from a YouTube video. Clean the transcript's text, then write a summary from the video content.

You Must:
- Generate a title for the summary, especially if the video title was misleading.
- Write a high-quality one-paragraph summary in 120 words or less.
- Convert the transcript into bullet points, maintaining all the key information.
- Each bullet point should be concise, focusing on one main idea. Expand on it a bit when appropriate.
- Every bullet point starts with a suitable emoji (to replace ✅) based on its text.
- Categorize bullet points that follow the same topic under an appropriately titled subheading, ordered by their mention order from the video's transcript.
- If an idea is linked to a time/date in the video, mention the time and date.
- If an idea was reinforced with an example, mention the example.
- Be as similar as possible to the video's voice and manner of speech. Avoid redundant language."""

def extract_video_id(url):
    # Matches youtu.be, embed/shorts, and standard watch?v= URLs
    regex = r'(?:https?://)?(?:www\.)?(?:youtube\.com/(?:watch\?(?:\S*?&)?v=|(?:v|e|embed|shorts)/)|youtu\.be/)([a-zA-Z0-9_-]{11})'
    match = re.search(regex, url)
    if match:
        return match.group(1)
    return None

def fetch_video_title(video_id):
    try:
        print(f"Fetching video title for video ID '{video_id}'")
        url = f"https://www.youtube.com/watch?v={video_id}"
        response = requests.get(url)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        title = soup.find("title").text
        if title.endswith(" - YouTube"):
            title = title[:-10]
        return title
    except Exception as e:
        print(f"Error fetching video title: {e}")
        return None

def fetch_transcript(video_id):
    try:
        print(f"Fetching transcript for video ID '{video_id}'")
        transcript = YouTubeTranscriptApi.get_transcript(video_id)
        print("Transcript fetched successfully.")
        transcript_text = "\n".join([entry['text'] for entry in transcript])
        return transcript_text
    except Exception as e:
        print(f"Error: {e}")
        return None

def sanitize_filename(name):
    return re.sub(r'[^a-zA-Z0-9-_]', '-', name).strip('-')

def save_summary_to_file(summary, transcript, title, video_url, nano_info):
    sanitized_title = sanitize_filename(title)
    filename = f"Summary-{sanitized_title}.md"
    with open(filename, 'w', encoding='utf-8') as f:
        # Write the title and video link
        f.write(f"### Title:\n{title}\n")
        f.write(f"### Link:\n{video_url}\n\n")  # 2 new lines separating the content

        # Write the summary and transcript, separated by a few blank lines
        f.write(f"{summary}\n\n" + "\n" * 2 + f"\n{transcript}")
        
        # Write the formatted NanoGPT response in human-readable JSON
        f.write(f"\n\n<NanoGPT>\n{json.dumps(nano_info, indent=4)}\n</NanoGPT>")
        
    print(f"Summary saved to '{filename}'")

def talk_to_gpt(prompt, system_prompt=None, model="llama-3.3-70b", messages=None):
    # Copy to avoid mutating the caller's list (and never share a mutable default)
    messages = list(messages) if messages else []
    if system_prompt:
        messages.insert(0, {"role": "system", "content": system_prompt})

    data = {
        "prompt": prompt,
        "model": model,
        "messages": messages
    }

    try:
        response = requests.post(f"{BASE_URL}/talk-to-gpt", headers=headers, json=data)
        if response.status_code == 200:
            return response.text
        else:
            print(f"Error {response.status_code}: {response.text}")
            return None
    except requests.RequestException as e:
        print("An error occurred:", e)
        return None

def main():
    # Argument parsing with help description
    parser = argparse.ArgumentParser(
        description="Fetch YouTube transcript and generate summary using NanoGPT.\n\n"
                    "This script fetches the transcript of a YouTube video using its video URL. "
                    "The transcript is passed as input to the NanoGPT API, which generates a summary. "
                    "The output summary is printed in the terminal and saved to a file, along with the original transcript."
    )
    parser.add_argument("-u", "--url", type=str, help="The YouTube video URL.")
    parser.add_argument("-p", "--prompt", type=str, help="The main prompt to send to the GPT model.")
    parser.add_argument("-s", "--system", type=str, help="The system-level prompt to set context (optional).")
    parser.add_argument("-m", "--model", type=str, default="llama-3.3-70b", help="The model to use (default: llama-3.3-70b).")
    args = parser.parse_args()

    # Get video URL
    video_url = args.url or input("Enter your YouTube video URL: ")
    video_id = extract_video_id(video_url)
    if not video_id:
        print("Error: Invalid YouTube URL. Unable to extract video ID.")
        return

    title = fetch_video_title(video_id)
    if not title:
        print("Error: Unable to fetch video title.")
        return

    transcript = fetch_transcript(video_id)
    if not transcript:
        print("Error: Unable to fetch transcript.")
        return

    # Set system prompt: default or passed in argument
    system_prompt = args.system or DEFAULT_SYSTEM_PROMPT
    prompt = args.prompt or transcript

    # Get response from NanoGPT
    response = talk_to_gpt(prompt, system_prompt=system_prompt, model=args.model)
    if not response:
        print("Error: Unable to get response from NanoGPT.")
        return

    print("NanoGPT response:")
    print(response)

    # Extract NanoGPT info from the response
    try:
        parts = response.split('<NanoGPT>')
        text_response = parts[0].strip()
        # Decode JSON part
        nano_info = json.loads(parts[1].split('</NanoGPT>')[0])

        # Save summary and transcript to file with human-readable JSON
        save_summary_to_file(text_response, transcript, title, video_url, nano_info)
    except (IndexError, json.JSONDecodeError):
        print("Error parsing NanoGPT info.")

if __name__ == "__main__":
    main()


Final Thoughts

The way I asked ChatGPT to co-develop the code with me was incremental: one feature at a time, testing as I went. From fetching transcripts to structuring the output with Markdown and JSON, each step brought the project closer to what I envisioned. I hope the final script is user-friendly enough.

What’s next?

Making the code also post these summaries on HIVE is something I'm considering, but I don't know the best way to approach it yet. I'd love to add a GUI too! But for now, I'm just happy with how far this has come.

Thanks for Reading.~

Posted Using InLeo Alpha