Getting hands-on with real-world AI projects is one of the best ways to level up your skills. But figuring out where to start can be difficult, especially if you're new to AI. Here, we break down five exciting AI projects you can implement over the weekend with Python, ordered from beginner to advanced. Each project takes a problem-first approach to building tools with real-world applications, offering a meaningful way to grow your skills.
1. Job Application Resume Optimizer (Beginner)
Updating your resume for different job descriptions can be time-consuming. This project automates the process by using AI to customize your resume based on job requirements, helping you better match recruiters' expectations.
Steps to Implement:
- Convert Your Resume to Markdown: Start by creating a simple Markdown version of your resume.
- Generate a Prompt: Write a prompt that takes your Markdown resume and the job description as input and outputs an updated resume.
- Integrate the OpenAI API: Use the OpenAI API to adjust your resume dynamically based on the job description.
- Convert to PDF: Use the markdown and pdfkit libraries to transform the updated Markdown resume into a PDF.

Libraries: openai, markdown, pdfkit

Code Example:
import openai
import markdown
import pdfkit

openai.api_key = "your_openai_api_key"

def generate_resume(md_resume, job_description):
    prompt = f"""
    Adapt my resume in Markdown format to better match the job description below.
    Tailor my skills and experiences to align with the role, emphasizing relevant
    qualifications while maintaining a professional tone.

    Resume in Markdown:
    {md_resume}

    Job Description:
    {job_description}

    Please return the updated resume in Markdown format.
    """
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

md_resume = "Your markdown resume content here."
job_description = "Job description content here."
updated_resume_md = generate_resume(md_resume, job_description)
# pdfkit renders HTML, so convert the Markdown to HTML first.
pdfkit.from_string(markdown.markdown(updated_resume_md), "optimized_resume.pdf")
This project can be expanded to batch-process multiple job descriptions, making it highly scalable.
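One way to sketch that batch extension, with the tailoring call stubbed out so the loop itself is the focus (the helper names and job titles here are illustrative assumptions; in the real project you would pass the generate_resume function from above):

```python
import re

def safe_filename(job_title):
    """Turn a job title into a filesystem-safe output file name."""
    slug = re.sub(r'[^0-9A-Za-z]+', '_', job_title).strip('_').lower()
    return f"resume_{slug}.md"

def batch_tailor(md_resume, jobs, tailor_fn):
    """Run the tailoring function once per job and collect the outputs."""
    results = {}
    for title, description in jobs.items():
        results[safe_filename(title)] = tailor_fn(md_resume, description)
    return results

jobs = {
    "Data Scientist": "Job description content here.",
    "ML Engineer": "Another job description.",
}
# Stub tailoring function; swap in generate_resume for real use.
outputs = batch_tailor("Your markdown resume.", jobs, lambda r, d: r + "\n" + d)
print(sorted(outputs))  # one tailored resume file name per job
```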
2. YouTube Video Summarizer (Beginner)
Many of us save videos to watch later but rarely find the time to get back to them. A YouTube summarizer can automatically generate summaries of educational or technical videos, giving you the key points without the full watch time.
Steps to Implement:
- Extract the Video ID: Use a regular expression to extract the video ID from a YouTube link.
- Get the Transcript: Use youtube-transcript-api to retrieve the video's transcript.
- Summarize with GPT-3.5: Pass the transcript to OpenAI's API to generate a concise summary.

Libraries: openai, youtube-transcript-api, re

Code Example:
import re
import openai
from youtube_transcript_api import YouTubeTranscriptApi

openai.api_key = "your_openai_api_key"

def extract_video_id(youtube_url):
    match = re.search(r'(?:v=|/)([0-9A-Za-z_-]{11}).*', youtube_url)
    return match.group(1) if match else None

def get_video_transcript(video_id):
    transcript = YouTubeTranscriptApi.get_transcript(video_id)
    return ' '.join([entry['text'] for entry in transcript])

def summarize_transcript(transcript):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Summarize the following transcript:\n{transcript}"}]
    )
    return response.choices[0].message.content

youtube_url = "https://www.youtube.com/watch?v=example"
video_id = extract_video_id(youtube_url)
transcript = get_video_transcript(video_id)
summary = summarize_transcript(transcript)
print("Summary:", summary)
With this tool, you can instantly create summaries for a collection of videos, saving valuable time.
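One practical wrinkle when scaling this up: long transcripts can exceed the model's context window. A minimal word-based chunking sketch (the 3,000-word limit is an illustrative assumption; choose a size that fits your model's actual token budget):

```python
def chunk_transcript(text, max_words=3000):
    """Split a transcript into chunks of at most max_words words."""
    words = text.split()
    return [' '.join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# A 7,000-word transcript splits into three chunks; summarize each chunk,
# then summarize the combined chunk summaries.
transcript = "word " * 7000
chunks = chunk_transcript(transcript)
print(len(chunks))  # 3
```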
3. Automatic PDF Organizer by Topic (Intermediate)
If you have a collection of research papers or other PDFs, organizing them by topic can be incredibly useful. In this project, we'll use AI to read each paper, identify its subject, and cluster similar documents together.
Steps to Implement:
- Read the PDF Content: Extract text from each PDF's abstract using PyMuPDF.
- Generate Embeddings: Use sentence-transformers to convert abstracts into embeddings.
- Cluster with K-Means: Use sklearn to group documents based on their similarity.
- Organize the Files: Move documents into folders based on their clusters.

Libraries: PyMuPDF, sentence_transformers, pandas, sklearn

Code Example:
import os
import shutil
import fitz  # PyMuPDF
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer('all-MiniLM-L6-v2')

def extract_abstract(pdf_path):
    pdf_document = fitz.open(pdf_path)
    # Use the first 500 characters of the first page as a stand-in for the abstract.
    abstract = pdf_document[0].get_text("text")[:500]
    pdf_document.close()
    return abstract

pdf_paths = ["path/to/pdf1.pdf", "path/to/pdf2.pdf"]
abstracts = [extract_abstract(pdf) for pdf in pdf_paths]
embeddings = model.encode(abstracts)

kmeans = KMeans(n_clusters=3)
labels = kmeans.fit_predict(embeddings)

for i, pdf_path in enumerate(pdf_paths):
    folder_name = f"Cluster_{labels[i]}"
    os.makedirs(folder_name, exist_ok=True)
    shutil.move(pdf_path, os.path.join(folder_name, os.path.basename(pdf_path)))
This organizer can be customized to analyze entire libraries of documents, making it an efficient tool for anyone managing large digital archives.
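To point the organizer at a whole library rather than a hard-coded list of paths, a recursive glob works; this sketch demonstrates it on a throwaway directory (the folder layout below is purely illustrative):

```python
import glob
import os
import tempfile

def collect_pdfs(library_dir):
    """Recursively find all PDF files under library_dir."""
    return sorted(glob.glob(os.path.join(library_dir, "**", "*.pdf"),
                            recursive=True))

# Demo on a throwaway directory tree with two PDFs and one non-PDF.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
for name in ["a.pdf", os.path.join("sub", "b.pdf"), "notes.txt"]:
    open(os.path.join(root, name), "w").close()

pdfs = collect_pdfs(root)
print([os.path.basename(p) for p in pdfs])  # ['a.pdf', 'b.pdf']
```

The resulting list can be fed straight into the extract_abstract loop above.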
4. Multimodal Document Search Tool (Intermediate)
In technical documents, key information is often embedded in both text and images. This project uses a multimodal model to enable searching for information across text and visual data.
Steps to Implement:
- Extract Text and Images: Use PyMuPDF to extract text and images from each PDF section.
- Generate Embeddings: Use a multimodal model to encode text and images.
- Search with Cosine Similarity: Match user queries against document embeddings using similarity scores.

Libraries: PyMuPDF, sentence_transformers, sklearn

Code Example:
import fitz  # PyMuPDF
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer('clip-ViT-B-32')

def extract_text_and_images(pdf_path):
    pdf_document = fitz.open(pdf_path)
    chunks = []
    for page_num in range(len(pdf_document)):
        page = pdf_document[page_num]
        chunks.append(page.get_text("text")[:500])
        for img in page.get_images(full=True):
            # Placeholder: a full implementation would extract and encode each image.
            chunks.append("image_placeholder")
    pdf_document.close()
    return chunks

def search_query(query, documents):
    query_embedding = model.encode(query)
    doc_embeddings = model.encode(documents)
    similarities = cosine_similarity([query_embedding], doc_embeddings)
    return similarities

pdf_path = "path/to/doc.pdf"
document_chunks = extract_text_and_images(pdf_path)
similarities = search_query("User's search query here", document_chunks)
print("Top matching sections:", similarities[0].argsort()[::-1][:3])
This multimodal search tool makes it easier to sift through complex documents by combining text and visual information into a shared search index.
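Under the hood, the search step is just cosine similarity between vectors. A small NumPy sketch of the ranking logic (the 2-D vectors are illustrative stand-ins for real embeddings, which typically have hundreds of dimensions):

```python
import numpy as np

def rank_by_similarity(query_vec, doc_vecs, top_k=3):
    """Return document indices sorted by cosine similarity to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q  # cosine similarity of each document to the query
    order = np.argsort(scores)[::-1][:top_k]
    return order.tolist(), scores[order]

docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
order, scores = rank_by_similarity(np.array([1.0, 0.1]), docs)
print(order)  # [0, 2, 1]: doc 0 is closest to the query, doc 1 farthest
```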
5. Advanced Document QA System (Advanced)
Building on the previous project, this system lets users ask questions about documents and get concise answers. We use document embeddings to find relevant information and a user interface to make it interactive.
Steps to Implement:
- Chunk and Embed: Extract and embed each document's content.
- Create a Search + QA System: Use embeddings for search, and integrate with OpenAI's API for question answering.
- Build an Interface with Gradio: Set up a simple Gradio UI for users to enter queries and receive answers.

Libraries: PyMuPDF, sentence_transformers, openai, gradio

Code Example:
import gradio as gr
import openai
from sentence_transformers import SentenceTransformer

openai.api_key = "your_openai_api_key"
model = SentenceTransformer("all-MiniLM-L6-v2")

def generate_response(message, history):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": message}]
    )
    return response.choices[0].message.content

demo = gr.ChatInterface(
    fn=generate_response,
    examples=["Explain this document section"]
)
demo.launch()
This interactive QA system, built with Gradio, brings conversational AI to documents, enabling users to ask questions and receive relevant answers.
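The retrieval half of the system can be sketched separately from the UI. In this hedged example, a toy bag-of-words encoder stands in for SentenceTransformer.encode so the flow runs without model downloads or API keys; the vocabulary, chunks, and prompt wording are all illustrative assumptions, and in the real project you would swap in real embeddings and send the built prompt to the OpenAI chat API:

```python
import re
import numpy as np

# Toy vocabulary and encoder standing in for a real embedding model.
VOCAB = ["refund", "policy", "shipping", "days"]

def toy_encode(text):
    """Count vocabulary words in the text to form a crude vector."""
    words = re.findall(r"\w+", text.lower())
    return np.array([float(words.count(w)) for w in VOCAB])

def retrieve(question, chunks, top_k=1):
    """Return the top_k chunks most similar to the question."""
    q = toy_encode(question)
    scores = [float(np.dot(q, toy_encode(c))) for c in chunks]
    order = np.argsort(scores)[::-1][:top_k]
    return [chunks[i] for i in order]

def build_prompt(question, context_chunks):
    """Pack the retrieved context and the question into one QA prompt."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

chunks = ["Refund policy: refunds within 30 days.", "Shipping takes 5 days."]
top = retrieve("What is the refund policy?", chunks)
prompt = build_prompt("What is the refund policy?", top)
print(top[0])  # the refund chunk is retrieved, not the shipping one
```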
These weekend AI projects offer practical applications for different skill levels. From resume optimization to advanced document QA, they empower you to build AI solutions that solve everyday problems, sharpen your skills, and create impressive additions to your portfolio.