
Scraping with Python

Build your scraper from zero with ChatGPT
There are many ways to do this with ChatGPT. For this example we won't ask for the complete code up front; we'll build it step by step. Let's start!

1. Asking for the base code

First, we'll ask for just a small piece of our scraper; in the following steps we'll improve it, again using ChatGPT. The first function we'll create extracts the URLs present on the first website we visit.

1.1 Prompt

I want to create a scraper using Python. The first thing I need to do is to create
a function that let me visit a website and extract its URLs

1.2 ChatGPT Response

ChatGPT giving base code for scraper made in Python
As we can see, we will have to install BeautifulSoup and requests:
$ pip install beautifulsoup4 requests
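
The response above is a screenshot, so its code is not reproduced in the text. As a rough idea of what a base function like this might look like, here is a minimal sketch using requests and BeautifulSoup. It is not ChatGPT's exact answer: the function names `extract_urls_from_html` and `extract_urls` and the 10-second timeout are assumptions made for illustration.

```python
import requests
from bs4 import BeautifulSoup


def extract_urls_from_html(html):
    """Return the href value of every <a> tag found in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True skips anchors that have no href attribute at all
    return [a["href"] for a in soup.find_all("a", href=True)]


def extract_urls(url):
    """Download a page and return the links found in it (names are hypothetical)."""
    response = requests.get(url, timeout=10)  # timeout value is an assumption
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return extract_urls_from_html(response.text)
```

Calling `extract_urls("https://docs.gpt4devs.com")` would return the raw href values exactly as they appear in the page, including any that start with "/".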

2. Asking for new features

When I ran this code with the URL https://docs.gpt4devs.com, I found many URLs starting with "/". These are relative paths without the domain, so the scraper can't visit them as they are. Let's tell ChatGPT that we need to fix URLs that start with "/".

2.1 Prompt

It works perfect, but I have a problem. Many URLs that it detects are starting
with "/" and not with the domain, so I can't visit them easy. I need you
fix it please

2.2 ChatGPT Response

ChatGPT fixing core function of our scraper
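
The fixed code is also only shown as a screenshot. One common way to handle this, which may or may not be what ChatGPT suggested, is `urljoin` from Python's standard library: it resolves a relative path against the page it was found on and leaves absolute URLs unchanged. The function name `extract_absolute_urls_from_html` below is an assumption for this sketch.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_absolute_urls_from_html(html, base_url):
    """Return every link in the HTML as an absolute URL (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # urljoin resolves "/path" against base_url and keeps full URLs as they are
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]
```

With `base_url` set to `https://docs.gpt4devs.com`, a link written as `/scraping` would come back with the domain in front of it, so it can be requested directly.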

3. Extract Title and URL

Now we'll start extracting some information from each URL. For this example I only wanted the title, but you can ask it to extract other tags or information.

3.1 Prompt

Perfect! Now I need to add another feature: I need to visit the URLs it finds
recursively and extract the title of each one

3.2 ChatGPT Response

Adding feature to our scraper made with ChatGPT and Python
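
Since this response is a screenshot as well, here is a hedged sketch of how a recursive version could work. It is not the code ChatGPT produced. The names `fetch_html` and `crawl_titles`, the `max_depth` limit, and the choice to stay on the starting domain are all assumptions added here so the recursion stops instead of wandering across the web.

```python
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its HTML (hypothetical helper)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def crawl_titles(url, fetch=fetch_html, max_depth=2, visited=None):
    """Recursively visit same-domain links and map each URL to its <title>."""
    if visited is None:
        visited = {}
    # Stop on pages we have already seen or when the depth budget runs out
    if url in visited or max_depth < 0:
        return visited
    try:
        html = fetch(url)
    except requests.RequestException:
        return visited  # skip pages that fail to load
    soup = BeautifulSoup(html, "html.parser")
    visited[url] = soup.title.get_text(strip=True) if soup.title else None
    domain = urlparse(url).netloc
    for a in soup.find_all("a", href=True):
        # Make the link absolute and drop any "#fragment" part
        link = urljoin(url, a["href"]).split("#")[0]
        if urlparse(link).netloc == domain:  # assumption: stay on one site
            crawl_titles(link, fetch, max_depth - 1, visited)
    return visited
```

Calling `crawl_titles("https://docs.gpt4devs.com")` would return a dictionary of URL to title. The `fetch` parameter exists only so the function can be tried with canned HTML instead of real network requests.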